417 research outputs found

    Algebraic optimization of recursive queries

    Over the past few years, much attention has been paid to deductive databases. They offer a logic-based interface, and allow formulation of complex recursive queries. However, they do not offer appropriate update facilities, and do not support existing applications. To overcome these problems an SQL-like interface is required besides a logic-based interface. In the PRISMA project we have developed a tightly-coupled distributed database, on a multiprocessor machine, with two user interfaces: SQL and PRISMAlog. Query optimization is localized in one component: the relational query optimizer. Therefore, we have defined an eXtended Relational Algebra that allows recursive query formulation and can also be used for expressing executable schedules, and we have developed algebraic optimization strategies for recursive queries. In this paper we describe an optimization strategy that rewrites regular (in the context of formal grammars) mutually recursive queries into standard Relational Algebra and transitive closure operations. We also describe how to push selections into the resulting transitive closure operations. The reason we focus on algebraic optimization is that, in our opinion, the new generation of advanced database systems will be built starting from existing state-of-the-art relational technology, instead of building a completely new class of systems.
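The idea of pushing a selection into a transitive closure can be illustrated with a minimal Python sketch. It is not PRISMA's eXtended Relational Algebra or the paper's rewrite rules; the function names `transitive_closure` and `reachable_from` and the sample `parent` relation are hypothetical, chosen only to show why the pushed-down form does less work.

```python
from collections import defaultdict

def transitive_closure(edges):
    """Semi-naive fixpoint: all pairs (x, y) joined by a path of length >= 1."""
    succ = defaultdict(set)
    for x, y in edges:
        succ[x].add(y)
    closure = set(edges)
    delta = set(edges)                 # tuples derived in the previous round
    while delta:
        new = {(x, z) for (x, y) in delta for z in succ[y]} - closure
        closure |= new
        delta = new
    return closure

def reachable_from(edges, start):
    """Selection x == start pushed into the closure: only paths from `start` are explored."""
    succ = defaultdict(set)
    for x, y in edges:
        succ[x].add(y)
    seen = set()
    frontier = set(succ[start])
    while frontier:
        seen |= frontier
        frontier = {z for y in frontier for z in succ[y]} - seen
    return {(start, y) for y in seen}

# Hypothetical sample data.
parent = {("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("eve", "fay")}

# Select after computing the full closure ...
unoptimized = {(x, y) for (x, y) in transitive_closure(parent) if x == "ann"}
# ... versus starting the closure from the selected constant.
optimized = reachable_from(parent, "ann")
assert unoptimized == optimized
```

Both forms return the same tuples, but the second never derives pairs such as `("eve", "fay")` that the selection would discard anyway.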

    Experiences with Some Benchmarks for Deductive Databases and Implementations of Bottom-Up Evaluation

    OpenRuleBench is a large benchmark suite for rule engines, which includes deductive databases. We previously proposed a translation of Datalog to C++ based on a method that "pushes" derived tuples immediately to places where they are used. In this paper, we report performance results of various implementation variants of this method compared to XSB, YAP and DLV. We study only a fraction of the OpenRuleBench problems, but we give a quite detailed analysis of each such task and the factors which influence performance. The results not only show the potential of our method and implementation approach, but could be valuable for anybody implementing systems which should be able to execute tasks of the discussed types. Comment: In Proceedings WLP'15/'16/WFLP'16, arXiv:1701.0014
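A rough sketch of what "pushing" derived tuples could look like, written in Python rather than the generated C++ the abstract mentions. This is an assumption-laden illustration of the general idea for a single two-rule program, not the authors' translation scheme; `eval_path_push` and its internals are hypothetical names.

```python
from collections import defaultdict

def eval_path_push(edges):
    """Bottom-up evaluation of
         path(X, Y) :- edge(X, Y).
         path(X, Z) :- edge(X, Y), path(Y, Z).
    Each newly derived path tuple is pushed at once to the rule body that uses it,
    instead of being collected into per-iteration delta relations."""
    preds = defaultdict(set)             # index on edge's second column: Y -> {X}
    for x, y in edges:
        preds[y].add(x)
    path = set()
    stack = []                           # derived tuples not yet pushed onward

    def derive(fact):
        if fact not in path:             # duplicate elimination
            path.add(fact)
            stack.append(fact)

    for e in edges:                      # rule 1: every edge is a path
        derive(e)
        while stack:
            y, z = stack.pop()
            for x in preds.get(y, ()):   # rule 2: join path(Y, Z) with edge(X, Y)
                derive((x, z))
    return path

chain = eval_path_push([(1, 2), (2, 3)])
cycle = eval_path_push([(1, 2), (2, 3), (3, 1)])
```

The explicit stack stands in for the immediate procedure calls a compiled version might use; the duplicate check is what guarantees termination on cyclic data.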

    A performant XQuery to SQL translator

    We describe a largely complete and efficient XQuery to SQL translation for XML publishing. Our translation supports the entire XQuery language, except for functions, if statements and upwards navigation axes. The system has three important properties. First, it preserves the correct XQuery semantics. This is accomplished by first translating XQuery into core-XQuery, using a complete XQuery implementation, Galax. Second, we optimize the resulting SQL queries. We develop a comprehensive framework for optimizing the XQuery to SQL translation, which is effective for a wide range of XQuery workloads. Third, our translation is platform independent. Our system achieves a high degree of efficiency on a wide range of relational systems. This paper reports an extensive experimental validation on several XQuery workloads, using MySQL, PostgreSQL, and SQL Server, and compares this approach with five native XQuery engines: Galax (the newer, optimized version), Saxon, QizOpen, IMDB and Quexo.

    Optimization of object query languages


    Semantic optimisation in datalog programs

    Bibliography: leaves 138-142. Datalog is the fusion of Prolog and Database technologies aimed at producing an efficient, logic-based, declarative language for databases. This fusion takes the best of logic programming for the syntax of Datalog, and the best of database systems for the operational part of Datalog. As is the case with all declarative languages, optimisation is necessary to improve the efficiency of programs. Semantic optimisation uses meta-knowledge describing the data in the database to optimise queries and rules, aiming to reduce the resources required to answer queries. In this thesis, I analyse prior work that has been done on semantic optimisation and then propose an optimisation system for Datalog that includes optimisation of recursive programs and a semantic knowledge management module. A language, DatalogiC, which is an extension of Datalog that allows semantic knowledge to be expressed, has also been devised as an implementation vehicle. Finally, empirical results concerning the benefits of semantic optimisation are reported.

    Implementation of a XQuery engine for large documents in CanstoreX

    XML is a markup language used for storing documents that contain structured information. Its flexibility helps in storing, processing and querying diverse and complex documents with any structure. While XML could theoretically be used to handle any document, currently available parsers require large amounts of main memory, which severely restricts the size of XML documents. As a result, some technologies have been developed to break XML documents into smaller chunks and allow parsers to load only a specific portion of the document when needed. Two major but diametrically opposite approaches for storing an XML document on disk have emerged. The first breaks an XML document into parent-child pairs and stores them in relational storage. The second builds a native storage for XML that attempts to capture the XML hierarchy directly. Canonical Storage for XML (CanStoreX) is a native storage technology being developed by our group at Iowa State University that has been tested for pagination of XML documents up to 100 gigabytes in size. CanStoreX requires that every page be a self-contained XML document in its own right. Thus the pages themselves form an XML-like hierarchy. XML can be used to encode a variety of data. Examples are system configuration, metadata, documents such as books, relational data, and object-oriented data. An array of technologies has developed to process XML documents. Our major interest in XML lies in the view that an XML document can be considered a database which can then be queried. There exist several query engines for XML. Kweelt is an excellent early platform that supports the Quilt query language. Quilt is a preliminary query language that has since been extended to XQuery, a query language standardized by the W3C. The original Kweelt uses a DOM parser; therefore it can only handle small documents. The main focus of this thesis is to deploy CanStoreX to query documents gigabytes in size. The resulting platform has been extensively tested.

    Mapper: an efficient data transformation operator

    Doctoral thesis in Informatics (Computer Engineering), presented to the Universidade de Lisboa through the Faculdade de Ciências, 2008. Data transformations are fundamental operations in legacy data migration, data integration, data cleaning, and data warehousing. These operations are often implemented as relational queries that aim at leveraging the optimization capabilities of most DBMSs. However, relational query languages like SQL are not expressive enough to specify one-to-many data transformations, an important class of data transformations that produce several output tuples for a single input tuple. These transformations are required for solving several types of data heterogeneities, like those that occur when the source data represents aggregations of the target data. This thesis proposes a new relational operator, named data mapper, as an extension to the relational algebra to address one-to-many data transformations, and focuses on its optimization. It also provides algebraic rewriting rules and execution algorithms for the logical and physical optimization, respectively. As a result, queries may be expressed as a combination of standard relational operators and mappers. The proposed optimizations have been experimentally validated and the key factors that influence the obtained performance gains identified. Keywords: Relational Algebra, Data Transformation, Data Integration, Data Cleaning, Data Warehousing.
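To make "one-to-many data transformation" concrete, here is a minimal Python sketch. It is not the thesis's mapper operator or its rewriting rules; `mapper`, `split_loan`, and the loan schema are hypothetical, picked to mirror the case where a source row aggregates several target rows.

```python
def mapper(relation, fn):
    """Apply fn to every input tuple; fn may yield zero, one, or many output tuples."""
    for row in relation:
        yield from fn(row)

def split_loan(row):
    """Hypothetical mapper function: expand one loan row into one row per instalment."""
    loan_id, total, n_instalments = row
    for k in range(1, n_instalments + 1):
        yield (loan_id, k, total / n_instalments)

# Hypothetical source relation: (loan_id, total_amount, number_of_instalments).
loans = [("L1", 300, 3), ("L2", 100, 2)]
instalments = list(mapper(loans, split_loan))

# A selection on an attribute the mapper copies through unchanged (loan_id here)
# can be applied before the mapper instead of after it, with the same result.
select_after = [r for r in mapper(loans, split_loan) if r[0] == "L1"]
select_before = list(mapper([r for r in loans if r[0] == "L1"], split_loan))
assert select_after == select_before
```

Filtering first means the mapper function runs only on the rows that survive, which is the kind of saving an algebraic rewrite of this shape aims for.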