
    Deconstructing yield operator to enhance streams processing

    This work was funded by the Annual Competition for Research, Development, Innovation and Artistic Creation Projects (IDI&CA) 2020 of the Instituto Politécnico de Lisboa, reference code IPL/2020/WebFluid/ISEL. Customizing stream pipelines with new user-defined operations is a well-known pattern in stream processing. However, programming languages face two challenges when considering stream extensibility: 1) providing a compact and readable way to express new operations, and 2) preserving the lazy behavior of streams. From here, we may find a consensus around the adoption of the generator operator, i.e. yield, as a means to fulfil both requirements, since most state-of-the-art programming languages provide this feature. Yet what is the performance overhead of interleaving a yield-based operation in stream processing? In this work we present a benchmark based on realistic use cases of two different web APIs, namely Last.fm and World Weather Online, where custom yield-based operations may degrade stream performance twofold. We also propose a purely functional and minimalistic design, named tinyield, that can be easily adopted in any programming language and provides a concise way of chaining extension operations fluently, with low overhead in the evaluated benchmarks. The tinyield proposal was deployed in three different libraries, namely for Java (jayield), JavaScript (tinyield4ts) and .NET (tinyield4net).
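
To make the two requirements in this abstract concrete, here is a minimal Python sketch of a user-defined, yield-based stream operation. It is an illustration only and not the tinyield design or any of its libraries; the names `collapse` and `pulled` are hypothetical helpers invented for this example. The log shows that the operation stays lazy: only as many source items are pulled as the consumer needs.

```python
from itertools import count, islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def collapse(source: Iterable[T]) -> Iterator[T]:
    """Hypothetical custom operation: drop adjacent duplicates, lazily."""
    sentinel = object()          # marks "no previous item yet"
    previous = sentinel
    for item in source:
        if previous is sentinel or item != previous:
            previous = item
            yield item           # hand one item downstream, then pause


def pulled(source: Iterable[T], log: List[T]) -> Iterator[T]:
    """Hypothetical probe: record every item actually pulled from the source."""
    for item in source:
        log.append(item)
        yield item


log: List[int] = []
infinite = (n // 2 for n in count())            # 0, 0, 1, 1, 2, 2, ...
first_three = list(islice(collapse(pulled(infinite, log)), 3))
```

Because `islice` stops after three results, the infinite source is never exhausted; the overhead question raised in the abstract concerns the cost of suspending and resuming a generator like `collapse` for every item.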

    Lambda Calculus in Core Aldwych

    Core Aldwych is a simple model for concurrent computation, involving the concept of agents which communicate through shared variables. Each variable will have exactly one agent that can write to it, and its value can never be changed once written, but a value can contain further variables which are written to later. A key aspect is that the reader of a value may become the writer of variables in it. In this paper we show how this model can be used to encode lambda calculus. Individual function applications can be explicitly encoded as lazy or not, as required. We then show how this encoding can be extended to cover functions which manipulate mutable variables, but with the underlying Core Aldwych implementation still using only immutable variables. The ordering of function applications then becomes an issue, with Core Aldwych able to model either the enforcement of an ordering or the retention of indeterminate ordering, which allows parallel execution.

    Representations of stream processors using nested fixed points

    We define representations of continuous functions on infinite streams of discrete values, both in the case of discrete-valued functions and in the case of stream-valued functions. We also define an operation on the representations of two continuous functions between streams that yields a representation of their composite. In the case of discrete-valued functions, the representatives are well-founded (finite-path) trees of a certain kind. The underlying idea can be traced back to Brouwer's justification of bar-induction, or to Kreisel and Troelstra's elimination of choice-sequences. In the case of stream-valued functions, the representatives are non-wellfounded trees pieced together in a coinductive fashion from well-founded trees. The definition requires an alternating fixpoint construction of some ubiquity.

    Probabilistic Graphical Models on Multi-Core CPUs using Java 8

    In this paper, we discuss software design issues related to the development of parallel computational intelligence algorithms on multi-core CPUs, using the new Java 8 functional programming features. In particular, we focus on probabilistic graphical models (PGMs) and present the parallelisation of a collection of algorithms that deal with inference and learning of PGMs from data, namely maximum likelihood estimation, importance sampling, and greedy search for solving combinatorial optimisation problems. Through these concrete examples, we tackle the problem of defining efficient data structures for PGMs and parallel processing of same-size batches of data sets using Java 8 features. We also provide straightforward techniques to code parallel algorithms that seamlessly exploit multi-core processors. The experimental analysis, carried out using our open source AMIDST (Analysis of MassIve Data STreams) Java toolbox, shows the merits of the proposed solutions. Comment: Pre-print version of the paper presented in the special issue on Computational Intelligence Software at IEEE Computational Intelligence Magazine journal.

    Stream Fusion, to Completeness

    Stream processing is mainstream (again): widely-used stream libraries are now available for virtually all modern OO and functional languages, from Java to C# to Scala to OCaml to Haskell. Yet expressivity and performance are still lacking. For instance, the popular, well-optimized Java 8 streams do not support the zip operator and are still an order of magnitude slower than hand-written loops. We present the first approach that represents the full generality of stream processing and eliminates overheads, via the use of staging. It is based on an unusually rich semantic model of stream interaction. We support any combination of zipping, nesting (or flat-mapping), sub-ranging, filtering and mapping, of finite or infinite streams. Our model captures idiosyncrasies that a programmer uses in optimizing stream pipelines, such as rate differences and the choice of a "for" vs. "while" loop. Our approach delivers hand-written-like code, but automatically. It explicitly avoids the reliance on black-box optimizers and sufficiently-smart compilers, offering the highest, guaranteed and portable performance. Our approach relies on high-level concepts that are then readily mapped into an implementation. Accordingly, we have two distinct implementations: an OCaml stream library, staged via MetaOCaml, and a Scala library for the JVM, staged via LMS. In both cases, we derive libraries richer and simultaneously many tens of times faster than past work. We greatly exceed in performance the standard stream libraries available in Java, Scala and OCaml, including the well-optimized Java 8 streams.
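
As a rough illustration of what "hand-written-like code" means in this abstract, the Python sketch below writes the same zip, filter, map and sum computation twice: once as composed lazy operators, and once as the single fused loop a programmer would write by hand. It does not perform staging and is unrelated to the MetaOCaml or LMS implementations; the function names `pipeline_sum` and `fused_sum` are hypothetical.

```python
from typing import List


def pipeline_sum(xs: List[int], ys: List[int]) -> int:
    """Composed operators: zip, then filter, then map, then sum."""
    pairs = zip(xs, ys)                              # stops at the shorter input
    kept = (pair for pair in pairs if pair[0] % 2 == 0)
    products = (a * b for a, b in kept)
    return sum(products)


def fused_sum(xs: List[int], ys: List[int]) -> int:
    """The same computation as one hand-written loop, with no intermediate streams."""
    total = 0
    for i in range(min(len(xs), len(ys))):           # zip's length rule, made explicit
        a = xs[i]
        if a % 2 == 0:                               # the filter step
            total += a * ys[i]                       # the map and sum steps
    return total


xs = [1, 2, 3, 4]
ys = [10, 20, 30, 40, 50]
composed_result = pipeline_sum(xs, ys)
fused_result = fused_sum(xs, ys)
```

The approach described in the abstract aims to generate code shaped like `fused_sum` automatically from a pipeline written in the style of `pipeline_sum`.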

    A Comparison of Big Data Frameworks on a Layered Dataflow Model

    In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models, for which only informal (and often confusing) semantics is generally provided, all share a common underlying model, namely, the Dataflow model. The Dataflow model we propose shows how various tools share the same expressiveness at different levels of abstraction. The contribution of this work is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm), thus making it easier to understand high-level data-processing applications written in such frameworks. Second, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level. Comment: 19 pages, 6 figures, 2 tables, in Proc. of the 9th Intl Symposium on High-Level Parallel Programming and Applications (HLPP), July 4-5 2016, Muenster, Germany.