Search CORE

2,376 research outputs found

A survey of parallel execution strategies for transitive closure and logic programs

Author: Cacace F.
Ceri S.
Houtsma M.A.W.
Publication venue: Kluwer Academic Publishers
Publication date: 01/01/1993
Field of study

An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other one focused on optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods, by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of methods

CiteSeerX

University of Twente Research Information

Algebraic optimization of recursive queries

Author: Apers Peter M.G.
Houtsma M.A.W.
Houtsma Maurice A.W.
Publication venue: North Holland
Publication date: 01/01/1992
Field of study

Over the past few years, much attention has been paid to deductive databases. They offer a logic-based interface, and allow formulation of complex recursive queries. However, they do not offer appropriate update facilities, and do not support existing applications. To overcome these problems an SQL-like interface is required besides a logic-based interface.\ud \ud In the PRISMA project we have developed a tightly-coupled distributed database, on a multiprocessor machine, with two user interfaces: SQL and PRISMAlog. Query optimization is localized in one component: the relational query optimizer. Therefore, we have defined an eXtended Relational Algebra that allows recursive query formulation and can also be used for expressing executable schedules, and we have developed algebraic optimization strategies for recursive queries. In this paper we describe an optimization strategy that rewrites regular (in the context of formal grammars) mutually recursive queries into standard Relational Algebra and transitive closure operations. We also describe how to push selections into the resulting transitive closure operations.\ud \ud The reason we focus on algebraic optimization is that, in our opinion, the new generation of advanced database systems will be built starting from existing state-of-the-art relational technology, instead of building a completely new class of systems

CiteSeerX

University of Twente Research Information

Data fragmentation for parallel transitive closure strategies

Author: Apers Peter M.G.
Houtsma M.A.W.
Houtsma Maurice A.W.
Schipper Gideon L.V.
Publication venue: IEEE
Publication date: 01/01/1993
Field of study

Addresses the problem of fragmenting a relation to make the parallel computation of the transitive closure efficient, based on the disconnection set approach. To better understand this design problem, the authors focus on transportation networks. These are characterized by loosely interconnected clusters of nodes with a high internal connectivity rate. Three requirements that have to be fulfilled by a fragmentation are formulated, and three different fragmentation strategies are presented, each emphasizing one of these requirements. Some test results are presented to show the performance of the various fragmentation strategie

University of Twente Research Information

Implementation and Performance Evaluation of a Parallel Transitive Closure Algorithms on PRISMA/DB

Author: Flokstra J.
Houtsma M.A.W.
Wilschut A.N.
Publication venue: Morgan Kaufmann Publishers Inc.
Publication date: 01/01/1992
Field of study

This paper is one of the first to discuss actual implementation of and experimentation with parallel transitive closure operations on a full-fledged relational database system. It brings two research efforts together; the development of an efficient execution strategy for parallel computation of path problems, called Disconnection Set Approach, and the development and implementation of a parallel, main-memory DBMS, called PRISMA/DB. First, we report on the implementation of the disconnection set approach on PRISMA/DB, showing how the latter's design allowed us to easily extend the functionality of the system. Second, we investigate the disconnection set approach's parallel behavior and performance by means of extensive experimentation. It is shown that the parallel implementation of the disconnection set approach yields very good performance characteristics, and that (super)linear speedup w.r.t. a special implementation of semi-naive is achieved for regular, so-called linear fragmenta..

CiteSeerX

University of Twente Research Information

Recommended from our members

Automatic view schema generation in object-oriented databases

Author: Bic Lubomir
Rundensteiner Elke A.
Publication venue: eScholarship, University of California
Publication date: 01/01/1992
Field of study

An object-oriented data schema is a complex structure of classes interrelated via generalization and property decomposition relationships. We define an object-oriented view to be a virtual schema graph with possibly restructured generalization and decomposition hierarchies - rather than just one individual virtual class as proposed in the literature. In this paper, we propose a methodology, called MultiView, for supporting multiple such view schemata. MultiView is anchored on the following complementary ideas: (a) the view definer derives virtual classes and then integrates them into one consistent global schema graph and (b) the view definer specifies arbitrarily complex view schemata on this augmented global schema. The focus of this paper is, however, on the second, less explored, issue. This part of the view definition is performed using the following two steps: (1) view class selection and (2) view schema graph generation. For the first, we have developed a view definition language that can be used by the view definer to specify the selection of the desired view classes from the global schema. For the second, we have developed two algorithms that automatically augment the set of selected view classes to generate a complete, minimal and consistent view class generalization hierarchy. The first algorithm has linear complexity but it assumes that the global schema graph is a tree. The second algorithm overcomes this restricting assumption and thus allows for multiple inheritance, but it does so at the cost of a higher complexity

eScholarship - University of California

RDF Querying

Author: A. Wilk
B. Ludäscher
B. Parsia
D. Chamberlin
E. Cohen
F. Bry
F. Bry
G. Gottlob
G. Karvounarakis
G. Karvounarakis
J. Bailey
J. Bruijn de
J. Robie
M. Kifer
M. Lacher
M. Magiridou
M. Marx
S. Abiteboul
S. Berger
T. Grust
V. Bönström
W. May
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006
Field of study

Reactive Web systems, Web services, and Web-based publish/ subscribe systems communicate events as XML messages, and in many cases require composite event detection: it is not sufficient to react to single event messages, but events have to be considered in relation to other events that are received over time. Emphasizing language design and formal semantics, we describe the rule-based query language XChangeEQ for detecting composite events. XChangeEQ is designed to completely cover and integrate the four complementary querying dimensions: event data, event composition, temporal relationships, and event accumulation. Semantics are provided as model and fixpoint theories; while this is an established approach for rule languages, it has not been applied for event queries before

CiteSeerX

Crossref

Open Access LMU

Oxford University Research Archive

CrocoPat 2.1 Introduction and Reference Manual

Author: Beyer Dirk
Noack Andreas
Publication venue
Publication date: 01/01/2004
Field of study

CrocoPat is an efficient, powerful and easy-to-use tool for manipulating relations of arbitrary arity, including directed graphs. This manual provides an introduction to and a reference for CrocoPat and its programming language RML. It includes several application examples, in particular from the analysis of structural models of software systems.Comment: 19 pages + cover, 2 eps figures, uses llncs.cls and cs_techrpt_cover.sty, for downloading the source code, binaries, and RML examples, see http://www.software-systemtechnik.de/CrocoPat

arXiv.org e-Print Archive

CiteSeerX

AiiDA: Automated Interactive Infrastructure and Database for Computational Science

Author: Cepellotti Andrea
Kozinsky Boris
Marzari Nicola
Pizzi Giovanni
Sabatini Riccardo
Publication venue: 'Elsevier BV'
Publication date: 08/09/2015
Field of study

Computational science has seen in the last decades a spectacular rise in the scope, breadth, and depth of its efforts. Notwithstanding this prevalence and impact, it is often still performed using the renaissance model of individual artisans gathered in a workshop, under the guidance of an established practitioner. Great benefits could follow instead from adopting concepts and tools coming from computer science to manage, preserve, and share these computational efforts. We illustrate here our paradigm sustaining such vision, based around the four pillars of Automation, Data, Environment, and Sharing. We then discuss its implementation in the open-source AiiDA platform (http://www.aiida.net), that has been tuned first to the demands of computational materials science. AiiDA's design is based on directed acyclic graphs to track the provenance of data and calculations, and ensure preservation and searchability. Remote computational resources are managed transparently, and automation is coupled with data storage to ensure reproducibility. Last, complex sequences of calculations can be encoded into scientific workflows. We believe that AiiDA's design and its sharing capabilities will encourage the creation of social ecosystems to disseminate codes, data, and scientific workflows.Comment: 30 pages, 7 figure

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Recommended from our members

MultiView : a methodology for supporting multiple view schemata in object-oriented databases

Author: Rundensteiner Elke A.
Publication venue: eScholarship, University of California
Publication date: 01/01/1992
Field of study

It has been widely recognized that object-oriented database (OODB) technology needs to be extended to provide a mechanism similar to views in relational database systems. We define an object-oriented view to be an arbitrarily complex virtual schema graph with possibly restructured generalization and decomposition hierarchies - rather than just one virtual class as has been proposed in the literature. In this paper, we propose a methodology, called MultiView, for supporting multiple such view schemata. MultiView breaks the schema design task into the following independent and well-defined subtasks: (1) the customization of type descriptions and object sets of existing classes by deriving virtual classes, (2) the integration of all derived classes into one consistent global schema graph, and (3) the definition of arbitrarily complex view schemata on this augmented global schema. For the first task of MultiView, we define a set of object algebra operators that can be used by the view definer for class customization. For the second task of MultiView, we propose an algorithm that automatically integrates these newly derived virtual classes into the global schema. We solve the third task of MultiView by first letting the view definer explicitly select the desired view classes from the global schema using a view definition language and then by automatically generating a view class hierarchy for these selected classes. In addition, we present algorithms that verify the closure property of a view and, if found to be incomplete, transform it into a closed, yet minimal, view. In this paper, we introduce the fundamental concept of view independence and show MultiView to be view independent. We also outline implementation techniques for realizing MultiView with existing OODB technology

eScholarship - University of California