72,415 research outputs found
Deconstructing yield operator to enhance streams processing
Este trabalho foi financiado pelo Concurso Anual para Projetos de Investigação, Desenvolvimento, Inovação e Criação Artística (IDI&CA) 2020 do Instituto Politécnico de Lisboa. Código de referência IPL/2020/WebFluid/ISELCustomizing streams pipelines with new user-defined operations is a well-known pattern regarding streams processing. However, programming languages face two challenges when considering streams extensibility: 1) provide a compact and readable way to express new operations, and 2) keep streams’ laziness behavior. From here, we may find a consensus around the adoption of the generator operator, i.e. yield, as a means to fulfil both requirements, since most state-of-the-art programming languages provide this feature. Yet, what is the performance overhead of interleaving a yield-based operation in streams processing? In this work we present a benchmark based on realistic use cases of two different web APIs, namely: Last.fm and world weather on line, where custom yield-based operations may degrade the streams performance in twofold. We also propose a purely functional and minimalistic design, named tinyield, that can be easily adopted in any programming language and provides a concise way of chaining extension operations fluently, with low overhead in the eval uated benchmarks. The tinyield proposal was deployed in three different libraries, namely for Java (jayield), JavaScript (tinyield4ts) and .Net (tinyield4net).info:eu-repo/semantics/publishedVersio
Lambda Calculus in Core Aldwych
Core Aldwych is a simple model for concurrent computation, involving the concept of agents which communicate through shared variables. Each variable will have exactly one agent that can write to it, and its value can never be changed once written, but a value can contain further variables which are written to later. A key aspect is that the reader of a value may become the writer of variables in it. In this paper we show how this model can be used to encode lambda calculus. Individual function applications can be explicitly encoded as lazy or not, as required. We then show how this encoding can be extended to cover functions which manipulate mutable variables, but with the underlying Core Aldwych implementation still using only immutable variables. The ordering of function applications then becomes an issue, with Core Aldwych able to model either the enforcement of an ordering or the retention of indeterminate ordering, which allows parallel execution
Representations of stream processors using nested fixed points
We define representations of continuous functions on infinite streams of discrete values, both in the case of discrete-valued functions, and in the case of stream-valued functions. We define also an operation on the representations of two continuous functions between streams that yields a representation of their composite. In the case of discrete-valued functions, the representatives are well-founded (finite-path) trees of a certain kind. The underlying idea can be traced back to Brouwer's justification of bar-induction, or to Kreisel and Troelstra's elimination of choice-sequences. In the case of stream-valued functions, the representatives are non-wellfounded trees pieced together in a coinductive fashion from well-founded trees. The definition requires an alternating fixpoint construction of some ubiquity
Probabilistic Graphical Models on Multi-Core CPUs using Java 8
In this paper, we discuss software design issues related to the development
of parallel computational intelligence algorithms on multi-core CPUs, using the
new Java 8 functional programming features. In particular, we focus on
probabilistic graphical models (PGMs) and present the parallelisation of a
collection of algorithms that deal with inference and learning of PGMs from
data. Namely, maximum likelihood estimation, importance sampling, and greedy
search for solving combinatorial optimisation problems. Through these concrete
examples, we tackle the problem of defining efficient data structures for PGMs
and parallel processing of same-size batches of data sets using Java 8
features. We also provide straightforward techniques to code parallel
algorithms that seamlessly exploit multi-core processors. The experimental
analysis, carried out using our open source AMIDST (Analysis of MassIve Data
STreams) Java toolbox, shows the merits of the proposed solutions.Comment: Pre-print version of the paper presented in the special issue on
Computational Intelligence Software at IEEE Computational Intelligence
Magazine journa
Stream Fusion, to Completeness
Stream processing is mainstream (again): Widely-used stream libraries are now
available for virtually all modern OO and functional languages, from Java to C#
to Scala to OCaml to Haskell. Yet expressivity and performance are still
lacking. For instance, the popular, well-optimized Java 8 streams do not
support the zip operator and are still an order of magnitude slower than
hand-written loops. We present the first approach that represents the full
generality of stream processing and eliminates overheads, via the use of
staging. It is based on an unusually rich semantic model of stream interaction.
We support any combination of zipping, nesting (or flat-mapping), sub-ranging,
filtering, mapping-of finite or infinite streams. Our model captures
idiosyncrasies that a programmer uses in optimizing stream pipelines, such as
rate differences and the choice of a "for" vs. "while" loops. Our approach
delivers hand-written-like code, but automatically. It explicitly avoids the
reliance on black-box optimizers and sufficiently-smart compilers, offering
highest, guaranteed and portable performance. Our approach relies on high-level
concepts that are then readily mapped into an implementation. Accordingly, we
have two distinct implementations: an OCaml stream library, staged via
MetaOCaml, and a Scala library for the JVM, staged via LMS. In both cases, we
derive libraries richer and simultaneously many tens of times faster than past
work. We greatly exceed in performance the standard stream libraries available
in Java, Scala and OCaml, including the well-optimized Java 8 streams
A Comparison of Big Data Frameworks on a Layered Dataflow Model
In the world of Big Data analytics, there is a series of tools aiming at
simplifying programming applications to be executed on clusters. Although each
tool claims to provide better programming, data and execution models, for which
only informal (and often confusing) semantics is generally provided, all share
a common underlying model, namely, the Dataflow model. The Dataflow model we
propose shows how various tools share the same expressiveness at different
levels of abstraction. The contribution of this work is twofold: first, we show
that the proposed model is (at least) as general as existing batch and
streaming frameworks (e.g., Spark, Flink, Storm), thus making it easier to
understand high-level data-processing applications written in such frameworks.
Second, we provide a layered model that can represent tools and applications
following the Dataflow paradigm and we show how the analyzed tools fit in each
level.Comment: 19 pages, 6 figures, 2 tables, In Proc. of the 9th Intl Symposium on
High-Level Parallel Programming and Applications (HLPP), July 4-5 2016,
Muenster, German
- …