184 research outputs found

    Algebraic optimization of recursive queries

    Over the past few years, much attention has been paid to deductive databases. They offer a logic-based interface and allow the formulation of complex recursive queries. However, they do not offer appropriate update facilities and do not support existing applications. To overcome these problems, an SQL-like interface is required alongside the logic-based interface. In the PRISMA project we have developed a tightly-coupled distributed database, on a multiprocessor machine, with two user interfaces: SQL and PRISMAlog. Query optimization is localized in one component: the relational query optimizer. Therefore, we have defined an eXtended Relational Algebra that allows recursive query formulation and can also be used for expressing executable schedules, and we have developed algebraic optimization strategies for recursive queries. In this paper we describe an optimization strategy that rewrites regular (in the context of formal grammars) mutually recursive queries into standard Relational Algebra and transitive closure operations. We also describe how to push selections into the resulting transitive closure operations. The reason we focus on algebraic optimization is that, in our opinion, the new generation of advanced database systems will be built starting from existing state-of-the-art relational technology, instead of building a completely new class of systems.
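To make the idea of pushing a selection into a transitive closure concrete, here is a minimal sketch in Python. It is not the PRISMA optimizer or its eXtended Relational Algebra: the function names, the `edges` relation, and the choice of semi-naive iteration are all assumptions made for illustration.

```python
def transitive_closure(edges):
    """Full closure via semi-naive iteration: only newly derived pairs are re-joined."""
    closure = set(edges)
    delta = set(edges)
    while delta:
        derived = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = derived - closure
        closure |= delta
    return closure


def closure_from(edges, source):
    """Closure with the selection `first column == source` applied before iterating."""
    closure = {(x, y) for (x, y) in edges if x == source}
    delta = set(closure)
    while delta:
        derived = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = derived - closure
        closure |= delta
    return closure


# Hypothetical binary relation; ("x", "y") is unrelated to source "a".
edges = {("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")}

# Selection applied after the closure: every pair is derived, then most are discarded.
selected_late = {(s, t) for (s, t) in transitive_closure(edges) if s == "a"}

# Selection pushed into the closure: only pairs starting at "a" are ever derived.
selected_early = closure_from(edges, "a")
```

Both variants return the same pairs, but the pushed-down version never materializes tuples such as `("x", "y")` or `("b", "d")`, which is the kind of saving an algebraic rewrite of this sort aims for.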

    Improving I/O Bandwidth for Data-Intensive Applications

    High disk bandwidth in data-intensive applications is usually achieved with expensive hardware solutions consisting of a large number of disks. In this article we present our current work on software methods for improving disk bandwidth in ColumnBM, a new storage system for the MonetDB/X100 query execution engine. Two novel techniques are discussed: superscalar compression for standalone queries and cooperative scans for multi-query optimization.

    Comparison of Data Search Using Hash Join and Nested Join Queries (Perbandingan Pencarian Data Menggunakan Query Hash Join dan Query Nested Join)

    Accessing or searching data with a query or join in an application connected to a database requires attention to how well the implementation suits the data itself and to its processing time. A database management system can process a query and produce its answer in many ways. All of them ultimately yield the same answer (output), but at different costs, such as the time taken to return the data. Two queries commonly used for data processing are the hash join query and the nested join query; the two use different algorithms but produce the same output. Both algorithms were tested with a network-based application built with Microsoft Visual Studio 2010 and Microsoft SQL Server 2008, using running time (the time taken to return data) as the parameter. The tests varied the number of joined tables and the number of rows/records. The study found that, in terms of query response time, the hash join query performs better for small amounts of data, in contrast to the nested join query for large amounts of data.
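The two join algorithms compared in the abstract above can be sketched as follows. This is a minimal in-memory illustration in Python, not the SQL Server implementation used in the study; the table contents and the `dept_id` column are hypothetical.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    return [(l, r) for l in left for r in right if l[key] == r[key]]


def hash_join(left, right, key):
    """Build a hash table on one input, then probe it once per row of the other."""
    buckets = defaultdict(list)
    for r in right:                      # build phase
        buckets[r[key]].append(r)
    # Probe phase: look up only the rows that share the join key.
    return [(l, r) for l in left for r in buckets.get(l[key], [])]


# Hypothetical tables, represented as lists of dict rows.
employees = [
    {"name": "Ani", "dept_id": 1},
    {"name": "Budi", "dept_id": 2},
    {"name": "Citra", "dept_id": 1},
]
departments = [
    {"dept_id": 1, "dept": "Sales"},
    {"dept_id": 2, "dept": "IT"},
    {"dept_id": 3, "dept": "HR"},
]

nl_result = nested_loop_join(employees, departments, "dept_id")
hj_result = hash_join(employees, departments, "dept_id")
```

Both functions return the same matched pairs; they differ only in how much work is done to find them, which is why measured running time depends on table sizes and the environment.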

    Counting, Enumerating and Sampling of Execution Plans in a Cost-Based Query Optimizer

    Testing an SQL database system by running large sets of deterministic or stochastic SQL statements is common practice in commercial database development. However, code defects often remain undetected because the query optimizer's choice of an execution plan depends not only on the query but is also strongly influenced by a large number of parameters describing the database and the hardware environment. Modifying these parameters in order to steer the optimizer to select other plans is difficult, since it means anticipating the often complex search strategies implemented in the optimizer. In this paper we devise algorithms for counting, exhaustive generation, and uniform sampling of plans from the complete search space. Our techniques allow extensive validation of both the generation of alternatives and the execution algorithms, using plans other than the optimized one: if two candidate plans fail to produce the same results, then either the optimizer considered an invalid plan, or the execution code is faulty. When the space of alternatives becomes too large for exhaustive testing, which can occur even with a handful of joins, uniform random sampling provides a mechanism for unbiased testing. The technique is implemented in Microsoft's SQL Server, where it is an integral part of the validation and testing process.
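One way to count, enumerate, and uniformly sample plans is to count the alternatives bottom-up and then map a rank in `[0, count)` back to a concrete plan. The sketch below illustrates that idea in Python on a toy plan space; it is not the SQL Server implementation, and `PLAN_SPACE`, the operator names, and the function names are hypothetical.

```python
import random
from functools import lru_cache
from math import prod

# Hypothetical plan space: each group lists alternative operators,
# and each operator names the groups that supply its inputs.
PLAN_SPACE = {
    "A": [("scan_A", ()), ("index_A", ())],
    "B": [("scan_B", ())],
    "C": [("scan_C", ()), ("index_C", ())],
    "AB": [("hash_join", ("A", "B")), ("nl_join", ("A", "B"))],
    "ABC": [("hash_join", ("AB", "C")), ("merge_join", ("AB", "C"))],
}


@lru_cache(maxsize=None)
def count(group):
    """Number of distinct plans rooted at `group`."""
    return sum(prod(count(c) for c in children) for _, children in PLAN_SPACE[group])


def unrank(group, rank):
    """Return the plan with the given rank, where 0 <= rank < count(group)."""
    for op, children in PLAN_SPACE[group]:
        n = prod(count(c) for c in children)
        if rank < n:
            subplans = []
            for c in children:
                # Split the rank into one mixed-radix digit per child group.
                rank, local = divmod(rank, count(c))
                subplans.append(unrank(c, local))
            return (op, *subplans)
        rank -= n
    raise IndexError("rank out of range")


def sample(group, rng=random):
    """Draw a plan uniformly at random by unranking a uniformly random rank."""
    return unrank(group, rng.randrange(count(group)))


all_plans = [unrank("ABC", r) for r in range(count("ABC"))]
```

Because every rank maps to a different plan, iterating over all ranks enumerates the space exhaustively, and drawing a random rank samples it without bias. Executing each such plan and comparing the results is then a separate testing step.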

    Cooperative scans

    Data mining, information retrieval and other application areas exhibit a query load with multiple concurrent queries touching a large fraction of a relation. This leads to individual query plans based on a table scan or large index scan. The implementation of this access path in most database systems is straightforward. The Scan operator issues next-page requests to the buffer manager without concern for the system state. Conversely, the buffer manager is not aware of the work ahead and focuses on keeping the most-recently-used pages in the buffer pool. This paper introduces cooperative scans, a new algorithm based on a better sharing of knowledge and responsibility between the Scan operator and the buffer manager, which significantly improves the performance of concurrent scan queries. In this approach, queries share the buffer content, and the buffer manager optimizes the progress of the scans by minimizing the number of disk transfers in light of the total workload ahead. The experimental results are based on a simulation of the various disk-access scheduling policies, and on an implementation of cooperative scans within PostgreSQL and MonetDB/X100. These real-life experiments show that with little effort the performance of existing database systems on concurrent scan queries can be strongly improved.

    Data Search Methods Using Hash Join and Nested Join Queries (METODE PENCARIAN DATA MENGGUNAKAN QUERY HASH JOIN DAN QUERY NESTED JOIN)

    Accessing or searching data with a query or join in an application connected to a database requires attention to how well the implementation suits the data itself and to its processing time. A database management system can process a query and produce its answer in many ways. All of them ultimately yield the same answer (output), but at different costs, such as the time taken to return the data. Two queries commonly used for data processing are the hash join query and the nested join query; the two use different algorithms but produce the same output. Both algorithms were tested with a network-based application built with Microsoft Visual Studio 2010 and Microsoft SQL Server 2008, using running time (the time taken to return data) as the parameter. The tests varied the number of joined tables and the number of rows/records. The study found that, in terms of query response time, the hash join query is better for small amounts of data, whereas the nested join query is better for large amounts of data.