
    Multi-route query processing and optimization

    A modern query optimizer typically picks a single query plan for all data based on overall data statistics. However, many have observed that real-life datasets tend to have non-uniform distributions. Selecting a single query plan may result in ineffective query execution for possibly large portions of the actual data. In addition, most stream query processing systems, given the volume of data, cannot precisely model the system state, much less account for uncertainty due to continuous variations. Such systems select a single query plan based upon imprecise statistics. In this paper, we present "Query Mesh" (or QM), a practical alternative to state-of-the-art data stream processing approaches. The main idea of QM is to compute multiple routes (i.e., query plans), each designed for a particular subset of the data with distinct statistical properties. We use the terms "plans" and "routes" interchangeably in our work. A classifier model is induced and used to assign the best route to process incoming tuples based upon their data characteristics. We formulate the QM search space and analyze its complexity. Due to the substantial search space, we propose several cost-based query optimization heuristics designed to effectively find nearly optimal QMs. We propose the Self-Routing Fabric (SRF) infrastructure, which supports query execution with multiple plans without physically constructing their topologies or using a central router such as Eddy. We also consider how to support uncertain route specification and execution in QM, which can occur when imprecise statistics lead to more than one optimal route for a subset of data. Our experimental results indicate that QM consistently provides better query execution performance and incurs negligible overhead compared to the alternative state-of-the-art data stream approaches.
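
    The multi-route idea in this abstract can be illustrated with a toy sketch. This is not the paper's Query Mesh or Self-Routing Fabric implementation: the operator names, the `source` attribute, and the hand-written `classify` rule (standing in for an induced classifier model) are all assumptions made up for illustration. The sketch only shows that routing each tuple through an operator ordering suited to its data subset can reduce the number of operator evaluations compared with one fixed plan.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
Operator = Callable[[Row], bool]

# Hypothetical selection operators (illustrative only, not from the paper).
OPERATORS: Dict[str, Operator] = {
    "a_small": lambda r: r["a"] < 10,
    "b_small": lambda r: r["b"] < 10,
}

# Each route (plan) is one ordering of the same operators.
ROUTES: Dict[str, List[str]] = {
    "a_first": ["a_small", "b_small"],
    "b_first": ["b_small", "a_small"],
}


def classify(row: Row) -> str:
    """Toy stand-in for a learned classifier: choose a route per tuple.

    Assumption: tuples with source == 0 tend to fail "a_small" and
    tuples with source == 1 tend to fail "b_small", so the more
    selective operator is placed first for each subset.
    """
    return "a_first" if row["source"] == 0 else "b_first"


def run_route(row: Row, route: List[str]) -> Tuple[bool, int]:
    """Apply operators in route order, stopping at the first failure.

    Returns (passed, number of operators evaluated).
    """
    evaluated = 0
    for name in route:
        evaluated += 1
        if not OPERATORS[name](row):
            return False, evaluated
    return True, evaluated


def process(rows: List[Row], choose_route: Callable[[Row], str]) -> Tuple[List[Row], int]:
    """Process all rows; return surviving rows and total operator evaluations."""
    results: List[Row] = []
    work = 0
    for row in rows:
        passed, evaluated = run_route(row, ROUTES[choose_route(row)])
        work += evaluated
        if passed:
            results.append(row)
    return results, work


rows = [
    {"source": 0, "a": 50, "b": 1},
    {"source": 0, "a": 60, "b": 2},
    {"source": 1, "a": 1, "b": 50},
    {"source": 1, "a": 2, "b": 60},
    {"source": 0, "a": 3, "b": 4},
]

# One plan for all data versus one route per classified subset.
single_results, single_work = process(rows, lambda r: "a_first")
mesh_results, mesh_work = process(rows, classify)
print(single_work, mesh_work)
```

    Both strategies return the same tuples; only the amount of work differs. In the paper the classifier is induced from data statistics and the routes are found by cost-based search, neither of which this sketch attempts.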

    Robust Query Optimization Methods With Respect to Estimation Errors: A Survey

    The quality of a query execution plan chosen by a Cost-Based Optimizer (CBO) depends greatly on the estimation accuracy of input parameter values. Many research results have been produced on improving the estimation accuracy, but they do not work for every situation. Therefore, "robust query optimization" was introduced, in an effort to minimize the sub-optimality risk by accepting the fact that estimates could be inaccurate. In this survey, we aim to provide an overview of robust query optimization methods by classifying them into different categories, explaining the essential ideas, listing their advantages and limitations, and comparing them with multiple criteria.
