151 research outputs found

    PrologPF: Parallel Logic and Functions on the Delphi Machine

    Get PDF
    PrologPF is a parallelising compiler targeting a distributed system of general purpose workstations connected by a relatively low performance network. The source language extends standard Prolog with the integration of higher-order functions. The execution of a compiled PrologPF program proceeds in a similar manner to standard Prolog, but uses oracles in one of two modes. An oracle represents the sequence of clauses used to reach a given point in the problem search tree, and the same PrologPF executable can be used to build oracles, or follow oracles previously generated. The parallelisation strategy used by PrologPF proceeds in two phases, which this research shows can be interleaved. An initial phase searches the problem tree to a limited depth, recording the discovered incomplete paths. In the second phase these paths are allocated to the available processors in the network. Each processor follows its assigned paths and fully searches the referenced subtree, sending solutions back to a control processor. This research investigates the use of the technique with a one-time partitioning of the problem and no further scheduling communication, and with the recursive application of the partitioning technique to effect dynamic work reassignment. For a problem requiring all solutions to be found, execution completes when all the distributed processors have completed the search of their assigned subtrees. If one solution is required, the execution of all the path processors is terminated when the control processor receives the first solution. The presence of the extra-logical Prolog predicate cut in the user program conflicts with the use of oracles to represent valid open subtrees. PrologPF promotes the use of higher-order functional programming as an alternative to the use of cut. The combined language shows that functional support can be added as a consistent extension to standard Prolog

    Acta Cybernetica : Volume 16. Number 1.

    Get PDF

    Algebraic incremental maintenance of XML views

    Get PDF
    International audienceMaterialized views can bring important performance benefits when querying XML documents. In the presence of XML document changes, materialized views need to be updated to faithfully reflect the changed document. In this work, we present an algebraic approach for propagating source updates to XML materialized views expressed in a powerful XML tree pattern formalism. Our approach differs from the state of the art in the area in two important ways. First, it relies on set-oriented, algebraic operations, to be contrasted with node-based previous approaches. Second, it exploits state-of-the-art features of XML stores and XML query evaluation engines, notably XML structural identifiers and associated structural join algorithms. We present algorithms for determining how updates should be propagated to views, and highlight the benefits of our approach over existing algorithms through a series of experiments

    K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources

    Get PDF
    The integration of heterogeneous data sources and software systems is a major issue in the biomed ical community and several approaches have been explored: linking databases, on-the- fly integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear winner . Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application

    Incremental Processing and Optimization of Update Streams

    Get PDF
    Over the recent years, we have seen an increasing number of applications in networking, sensor networks, cloud computing, and environmental monitoring, which monitor, plan, control, and make decisions over data streams from multiple sources. We are interested in extending traditional stream processing techniques to meet the new challenges of these applications. Generally, in order to support genuine continuous query optimization and processing over data streams, we need to systematically understand how to address incremental optimization and processing of update streams for a rich class of queries commonly used in the applications. Our general thesis is that efficient incremental processing and re-optimization of update streams can be achieved by various incremental view maintenance techniques if we cast the problems as incremental view maintenance problems over data streams. We focus on two incremental processing of update streams challenges currently not addressed in existing work on stream query processing: incremental processing of transitive closure queries over data streams, and incremental re-optimization of queries. In addition to addressing these specific challenges, we also develop a working prototype system Aspen, which serves as an end-to-end stream processing system that has been deployed as the foundation for a case study of our SmartCIS application. We validate our solutions both analytically and empirically on top of our prototype system Aspen, over a variety of benchmark workloads such as TPC-H and LinearRoad Benchmarks

    Compilation Techniques for Incremental Collection Processing

    Get PDF
    Many map-reduce frameworks as well as NoSQL systems rely on collection programming as their interface of choice due to its rich semantics along with an easily parallelizable set of primitives. Unfortunately, the potential of collection programming is not entirely fulfilled by current systems as they lack efficient incremental view maintenance (IVM) techniques for queries producing large nested results. This comes as a consequence of the fact that the nesting of collections does not enjoy the same algebraic properties underscoring the optimization potential of typical collection processing constructs. We propose the first solution for the efficient incrementalization of collection programming in terms of its core constructs as captured by the positive nested relational calculus (NRC+) on bags (with integer multiplicities). We take an approach based on delta query derivation, whose goal is to generate delta queries which, given a small change in the input, can update the materialized view more efficiently than via recomputation. More precisely, we model the cost of NRC+ operators and classify queries as efficiently incrementalizable if their delta has a strictly lower cost than full re-evaluation. Then, we identify IncNRC+, a large fragment of NRC+ that is efficiently incrementalizable and we provide a semantics-preserving translation that takes any NRC+ query to a collection of IncNRC+ queries. Furthermore, we prove that incrementalmaintenance for NRC+ is within the complexity class NC0 and we showcase how Recursive IVM, a technique that has provided significant speedups over traditional IVM in the case of flat queries, can also be applied to IncNRC+ . Existing systems are also limited wrt. the size of inner collections that they can effectively handle before running into severe performance bottlenecks. In particular, in the face of nested collections with skewed cardinalities developers typically have to undergo a painful process of manual query re-writes in order to ensure that the largest inner collections in their workloads are not impacted by these limitations. To address these issues we developed SLeNDer, a compilation framework that given a nested query generates a set of semantically equivalent (partially) shredded queries that can be efficiently evaluated and incrementalized using state of the art techniques for handling skew and applying delta changes, respectively. The derived queries expose nested collections to the same opportunities for distributing their processing and incrementally updating their contents as those enjoyed by top-level collections, leading on our benchmark to up to 16.8x and 21.9x speedups in terms of offline and online processing, respectively. In order to enable efficient IVM for the increasingly common case of collection programming with functional values as in Links, we also discuss the efficient incrementalization of simplytyped lambda calculi, under the constraint that their primitives are themselves efficiently incrementalizable

    DBToaster: higher-order delta processing for dynamic, frequently fresh views

    Get PDF
    Applications ranging from algorithmic trading to scientific data analysis require real-time analytics based on views over databases receiving thousands of updates each second. Such views have to be kept fresh at millisecond latencies. At the same time, these views have to support classical SQL, rather than window semantics, to enable applications that combine current with aged or historical data. In this article, we present the DBToaster system, which keeps materialized views of standard SQL queries continuously fresh as data changes very rapidly. This is achieved by a combination of aggressive compilation techniques and DBToaster's original recursive finite differencing technique which materializes a query and a set of its higher-order deltas as views. These views support each other's incremental maintenance, leading to a reduced overall view maintenance cost. DBToaster supports tens of thousands of complete view refreshes per second for a wide range of queries
    • …
    corecore