2 research outputs found

    A bloom-filter strategy for response time reduction in distributed query processing.

    Get PDF
    In distributed database systems, query optimization is to find strategies attempt to minimize the amount of data transmitted over the network. Optimization algorithms have an important impact on the performance of distributed query processing. Since optimal query processing in distributed database systems has been shown to be NP-Hard [WC96], heuristics are applied to find a cost-effective and efficient (but suboptimal) processing strategy. Many query optimization strategies have been proposed to minimize either the total cost or the response time. The approaches in distributed query processing have mainly focused on the use of joins, semijoins, and filters. In this thesis, we propose a new reduction strategy based on bloom-filters to significantly reduce the response time of a distributed query. This algorithm can process general queries consisting of an arbitrary number of relations and join attributes. The performance of the algorithm with respect to response time is compared against the Initial Feasible Solution (IFS). An amount of experimental results has been used to evaluate the performance of our algorithm. Compared to the IFS, our algorithm provides a significantly improved query solution. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2003 .G36. Source: Masters Abstracts International, Volume: 43-05, page: 1749. Thesis (M.Sc.)--University of Windsor (Canada), 2003

    Reduction of collisions in Bloom filters during distributed query optimization.

    Get PDF
    The goal of distributed query optimization is to find the optimal strategy for the execution of a given query. The approaches in distributed query processing have mainly focused on the use of joins, semijoins, and filters. Semijoins have the advantage over joins in that there are no increases in data sizes. However, a semijoin needs more local processing such as projection and higher data transmission. To improve the distributed query processing, the filter-based approach is utilized. One of the limitations of this approach is collisions. We investigate how collisions affect the performance of the algorithm and how performance can be improved given those collisions. Our proposed algorithm utilizes two sets of filters to reduce the collisions, so the performance has been improved when collisions exist. Our proposed algorithm is evaluated objectively by comparison to a full reducer which is the algorithm that fully reduces all relations involved in a query by eliminating all non-participating tuples from the relations. The results of the evaluation show that: (1) With a perfect hash function, on average, our algorithm eliminates 97.41% of the unneeded data and fully reduces the relations of over 70% of the queries. (2) Using a single set of filters with specific percentages of collisions, on average, less than half of a queries are fully reduced by the algorithm. Therefore, the collisions substantially affects the performance. (3) Using two sets of filters, On average, our algorithm eliminates 95% of noncontributive tuples and achieves over 60% full reduction. In conclusion, our improved algorithm utilizes the two sets of filters to reduce the effects of collisions substantially. Therefore, we improve the performance of our algorithm under the assumption of collisions which is the major problem in using Bloom filters during distributed query optimization.Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis1999 .L53. Source: Masters Abstracts International, Volume: 39-02, page: 0528. Adviser: Joan Morrissey. Thesis (M.Sc.)--University of Windsor (Canada), 1999
    corecore