11,954 research outputs found

    Deconstructing yield operator to enhance streams processing

    Get PDF
    Este trabalho foi financiado pelo Concurso Anual para Projetos de Investigação, Desenvolvimento, Inovação e Criação Artística (IDI&CA) 2020 do Instituto Politécnico de Lisboa. Código de referência IPL/2020/WebFluid/ISELCustomizing streams pipelines with new user-defined operations is a well-known pattern regarding streams processing. However, programming languages face two challenges when considering streams extensibility: 1) provide a compact and readable way to express new operations, and 2) keep streams’ laziness behavior. From here, we may find a consensus around the adoption of the generator operator, i.e. yield, as a means to fulfil both requirements, since most state-of-the-art programming languages provide this feature. Yet, what is the performance overhead of interleaving a yield-based operation in streams processing? In this work we present a benchmark based on realistic use cases of two different web APIs, namely: Last.fm and world weather on line, where custom yield-based operations may degrade the streams performance in twofold. We also propose a purely functional and minimalistic design, named tinyield, that can be easily adopted in any programming language and provides a concise way of chaining extension operations fluently, with low overhead in the eval uated benchmarks. The tinyield proposal was deployed in three different libraries, namely for Java (jayield), JavaScript (tinyield4ts) and .Net (tinyield4net).info:eu-repo/semantics/publishedVersio

    Optimizing Abstract Abstract Machines

    Full text link
    The technique of abstracting abstract machines (AAM) provides a systematic approach for deriving computable approximations of evaluators that are easily proved sound. This article contributes a complementary step-by-step process for subsequently going from a naive analyzer derived under the AAM approach, to an efficient and correct implementation. The end result of the process is a two to three order-of-magnitude improvement over the systematically derived analyzer, making it competitive with hand-optimized implementations that compute fundamentally less precise results.Comment: Proceedings of the International Conference on Functional Programming 2013 (ICFP 2013). Boston, Massachusetts. September, 201

    Stream Fusion, to Completeness

    Full text link
    Stream processing is mainstream (again): Widely-used stream libraries are now available for virtually all modern OO and functional languages, from Java to C# to Scala to OCaml to Haskell. Yet expressivity and performance are still lacking. For instance, the popular, well-optimized Java 8 streams do not support the zip operator and are still an order of magnitude slower than hand-written loops. We present the first approach that represents the full generality of stream processing and eliminates overheads, via the use of staging. It is based on an unusually rich semantic model of stream interaction. We support any combination of zipping, nesting (or flat-mapping), sub-ranging, filtering, mapping-of finite or infinite streams. Our model captures idiosyncrasies that a programmer uses in optimizing stream pipelines, such as rate differences and the choice of a "for" vs. "while" loops. Our approach delivers hand-written-like code, but automatically. It explicitly avoids the reliance on black-box optimizers and sufficiently-smart compilers, offering highest, guaranteed and portable performance. Our approach relies on high-level concepts that are then readily mapped into an implementation. Accordingly, we have two distinct implementations: an OCaml stream library, staged via MetaOCaml, and a Scala library for the JVM, staged via LMS. In both cases, we derive libraries richer and simultaneously many tens of times faster than past work. We greatly exceed in performance the standard stream libraries available in Java, Scala and OCaml, including the well-optimized Java 8 streams

    Measuring NUMA effects with the STREAM benchmark

    Full text link
    Modern high-end machines feature multiple processor packages, each of which contains multiple independent cores and integrated memory controllers connected directly to dedicated physical RAM. These packages are connected via a shared bus, creating a system with a heterogeneous memory hierarchy. Since this shared bus has less bandwidth than the sum of the links to memory, aggregate memory bandwidth is higher when parallel threads all access memory local to their processor package than when they access memory attached to a remote package. But, the impact of this heterogeneous memory architecture is not easily understood from vendor benchmarks. Even where these measurements are available, they provide only best-case memory throughput. This work presents a series of modifications to the well-known STREAM benchmark to measure the effects of NUMA on both a 48-core AMD Opteron machine and a 32-core Intel Xeon machine

    06472 Abstracts Collection - XQuery Implementation Paradigms

    Get PDF
    From 19.11.2006 to 22.11.2006, the Dagstuhl Seminar 06472 ``XQuery Implementation Paradigms'' was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available
    • …
    corecore