Search CORE

3,769 research outputs found

Extending a multi-set relational algebra to a parallel environment

Author: Flokstra J.
Grefen P.W.P.J.
Publication venue: Springer Verlag
Publication date: 01/01/1996
Field of study

Parallel database systems will very probably be the future for high-performance data-intensive applications. In the past decade, many parallel database systems have been developed, together with many languages and approaches to specify operations in these systems. A common background is still missing, however. This paper proposes an extended relational algebra for this purpose, based on the well-known standard relational algebra. The extended algebra provides both complete database manipulation language features, and data distribution and process allocation primitives to describe parallelism. It is defined in terms of multi-sets of tuples to allow handling of duplicates and to obtain a close connection to the world of high-performance data processing. Due to its algebraic nature, the language is well suited for optimization and parallelization through expression rewriting. The proposed language can be used as a database manipulation language on its own, as has been done in the PRISMA parallel database project, or as a formal basis for other languages, like SQL

University of Twente Research Information

A Data Transformation System for Biological Data Sources

Author: Buneman Peter
Davidson Susan
Hart Kyle
Overton Chris
Wong L.
Publication venue
Publication date: 01/01/1995
Field of study

Scientific data of importance to biologists in the Human Genome Project resides not only in conventional databases, but in structured files maintained in a number of different formats (e.g. ASN.1 and ACE) as well a.s sequence analysis packages (e.g. BLAST and FASTA). These formats and packages contain a number of data types not found in conventional databases, such as lists and variants, and may be deeply nested. We present in this paper techniques for querying and transforming such data, and illustrate their use in a prototype system developed in conjunction with the Human Genome Center for Chromosome 22. We also describe optimizations performed by the system, a crucial issue for bulk data

CiteSeerX

Edinburgh Research Explorer

ScholarlyCommons@Penn

Algebraic optimization of recursive queries

Author: Apers Peter M.G.
Houtsma M.A.W.
Houtsma Maurice A.W.
Publication venue: North Holland
Publication date: 01/01/1992
Field of study

Over the past few years, much attention has been paid to deductive databases. They offer a logic-based interface, and allow formulation of complex recursive queries. However, they do not offer appropriate update facilities, and do not support existing applications. To overcome these problems an SQL-like interface is required besides a logic-based interface.\ud \ud In the PRISMA project we have developed a tightly-coupled distributed database, on a multiprocessor machine, with two user interfaces: SQL and PRISMAlog. Query optimization is localized in one component: the relational query optimizer. Therefore, we have defined an eXtended Relational Algebra that allows recursive query formulation and can also be used for expressing executable schedules, and we have developed algebraic optimization strategies for recursive queries. In this paper we describe an optimization strategy that rewrites regular (in the context of formal grammars) mutually recursive queries into standard Relational Algebra and transitive closure operations. We also describe how to push selections into the resulting transitive closure operations.\ud \ud The reason we focus on algebraic optimization is that, in our opinion, the new generation of advanced database systems will be built starting from existing state-of-the-art relational technology, instead of building a completely new class of systems

CiteSeerX

University of Twente Research Information

Manipulation of expressions in a relational algebra

Author: Engmann R.
Stroet J.W.M.
Publication venue: Pergamon Press
Publication date: 01/01/1979
Field of study

This paper describes a syntax for expressions based on the relational algebra. A tree representation is generated when an expression is analyzed. Transformations on the tree representations of expressions are applied in order to obtain improvements with respect to the speed of evaluation in a data base environment

University of Twente Research Information

Deductive Optimization of Relational Data Storage

Author: Feser John K.
Madden Samuel
Solar-Lezama Armando
Tang Nan
Publication venue
Publication date: 05/02/2020
Field of study

Optimizing the physical data storage and retrieval of data are two key database management problems. In this paper, we propose a language that can express a wide range of physical database layouts, going well beyond the row- and column-based methods that are widely used in database management systems. We use deductive synthesis to turn a high-level relational representation of a database query into a highly optimized low-level implementation which operates on a specialized layout of the dataset. We build a compiler for this language and conduct experiments using a popular database benchmark, which shows that the performance of these specialized queries is competitive with a state-of-the-art in memory compiled database system

arXiv.org e-Print Archive

DSpace@MIT

PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development

Author: Barnett R. Matthew
Jermaine Chris
Lorido-Botran Tania
Luo Shangyu
Monroy Carlos
Sikdar Sourav
Teymourian Kia
Yuan Binhang
Zou Jia
Publication venue
Publication date: 01/01/2017
Field of study

This paper describes PlinyCompute, a system for development of high-performance, data-intensive, distributed computing tools and libraries. In the large, PlinyCompute presents the programmer with a very high-level, declarative interface, relying on automatic, relational-database style optimization to figure out how to stage distributed computations. However, in the small, PlinyCompute presents the capable systems programmer with a persistent object data model and API (the "PC object model") and associated memory management system that has been designed from the ground-up for high performance, distributed, data-intensive computing. This contrasts with most other Big Data systems, which are constructed on top of the Java Virtual Machine (JVM), and hence must at least partially cede performance-critical concerns such as memory management (including layout and de/allocation) and virtual method/function dispatch to the JVM. This hybrid approach---declarative in the large, trusting the programmer's ability to utilize PC object model efficiently in the small---results in a system that is ideal for the development of reusable, data-intensive tools and libraries. Through extensive benchmarking, we show that implementing complex objects manipulation and non-trivial, library-style computations on top of PlinyCompute can result in a speedup of 2x to more than 50x or more compared to equivalent implementations on Spark.Comment: 48 pages, including references and Appendi

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)

Flattening an object algebra to provide performance

Author: Boncz P.
Kersten M.L.
Wilschut A.N.
Publication venue: IEEE Computer Society
Publication date: 01/01/1998
Field of study

Algebraic transformation and optimization techniques have been the method of choice in relational query execution, but applying them in object-oriented (OO) DBMSs is difficult due to the complexity of OO query languages. This paper demonstrates that the problem can be simplified by mapping an OO data model to the binary relational model implemented by Monet, a state-of-the-art database kernel. We present a generic mapping scheme to flatten data models and study the case of straightforward OO model. We show how flattening enabled us to implement a query algebra, using only a very limited set of simple operations. The required primitives and query execution strategies are discussed, and their performance is evaluated on the 1-GByte TPC-D (Transaction-processing Performance Council's Benchmark D), showing that our divide-and-conquer approach yields excellent result

CiteSeerX

CWI's Institutional Repository

University of Twente Research Information

International Migration, Integration and Social Cohesion online publications

06472 Abstracts Collection - XQuery Implementation Paradigms

Author: Boncz Peter A.
Grust Torsten
Keulen Maurice van
Siméon Jerome
Publication venue: Internationales Begegnungs- und Forschungszentrum fuer Informatik (IBFI)
Publication date: 01/01/2007
Field of study

From 19.11.2006 to 22.11.2006, the Dagstuhl Seminar 06472 ``XQuery Implementation Paradigms'' was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

CWI's Institutional Repository

Dagstuhl Research Online Publication Server

University of Twente Research Information

Implementing PRISMA/DB in an OOPL

Author: Apers P.M.G.
Grefen P.W.P.J.
Kersten M.L.
Wilschut A.N.
Publication venue: Springer
Publication date: 01/01/1989
Field of study

PRISMA/DB is implemented in a parallel object-oriented language to gain insight in the usage of parallelism. This environment allows us to experiment with parallelism by simply changing the allocation of objects to the processors of the PRISMA machine. These objects are obtained by a strictly modular design of PRISMA/DB. Communication between the objects is required to cooperatively handle the various tasks, but it limits the potential for parallelism. From this approach, we hope to gain a better understanding of parallelism, which can be used to enhance the performance of PRISMA/DB.\ud The work reported in this document was conducted as part of the PRISMA project, a joint effort with Philips Research Eindhoven, partially supported by the Dutch "Stimuleringsprojectteam Informaticaonderzoek (SPIN)

University of Twente Research Information