193,046 research outputs found

    Neo: A Learned Query Optimizer

    Full text link
    Query optimization is one of the most challenging problems in database systems. Despite the progress made over the past decades, query optimizers remain extremely complex components that require a great deal of hand-tuning for specific workloads and datasets. Motivated by this shortcoming and inspired by recent advances in applying machine learning to data management challenges, we introduce Neo (Neural Optimizer), a novel learning-based query optimizer that relies on deep neural networks to generate query executions plans. Neo bootstraps its query optimization model from existing optimizers and continues to learn from incoming queries, building upon its successes and learning from its failures. Furthermore, Neo naturally adapts to underlying data patterns and is robust to estimation errors. Experimental results demonstrate that Neo, even when bootstrapped from a simple optimizer like PostgreSQL, can learn a model that offers similar performance to state-of-the-art commercial optimizers, and in some cases even surpass them

    Plug & Test at System Level via Testable TLM Primitives

    Get PDF
    With the evolution of Electronic System Level (ESL) design methodologies, we are experiencing an extensive use of Transaction-Level Modeling (TLM). TLM is a high-level approach to modeling digital systems where details of the communication among modules are separated from the those of the implementation of functional units. This paper represents a first step toward the automatic insertion of testing capabilities at the transaction level by definition of testable TLM primitives. The use of testable TLM primitives should help designers to easily get testable transaction level descriptions implementing what we call a "Plug & Test" design methodology. The proposed approach is intended to work both with hardware and software implementations. In particular, in this paper we will focus on the design of a testable FIFO communication channel to show how designers are given the freedom of trading-off complexity, testability levels, and cos

    Stochastic Database Cracking: Towards Robust Adaptive Indexing in Main-Memory Column-Stores

    Get PDF
    Modern business applications and scientific databases call for inherently dynamic data storage environments. Such environments are characterized by two challenging features: (a) they have little idle system time to devote on physical design; and (b) there is little, if any, a priori workload knowledge, while the query and data workload keeps changing dynamically. In such environments, traditional approaches to index building and maintenance cannot apply. Database cracking has been proposed as a solution that allows on-the-fly physical data reorganization, as a collateral effect of query processing. Cracking aims to continuously and automatically adapt indexes to the workload at hand, without human intervention. Indexes are built incrementally, adaptively, and on demand. Nevertheless, as we show, existing adaptive indexing methods fail to deliver workload-robustness; they perform much better with random workloads than with others. This frailty derives from the inelasticity with which these approaches interpret each query as a hint on how data should be stored. Current cracking schemes blindly reorganize the data within each query's range, even if that results into successive expensive operations with minimal indexing benefit. In this paper, we introduce stochastic cracking, a significantly more resilient approach to adaptive indexing. Stochastic cracking also uses each query as a hint on how to reorganize data, but not blindly so; it gains resilience and avoids performance bottlenecks by deliberately applying certain arbitrary choices in its decision-making. Thereby, we bring adaptive indexing forward to a mature formulation that confers the workload-robustness previous approaches lacked. Our extensive experimental study verifies that stochastic cracking maintains the desired properties of original database cracking while at the same time it performs well with diverse realistic workloads.Comment: VLDB201

    Active SLAM for autonomous underwater exploration

    Get PDF
    Exploration of a complex underwater environment without an a priori map is beyond the state of the art for autonomous underwater vehicles (AUVs). Despite several efforts regarding simultaneous localization and mapping (SLAM) and view planning, there is no exploration framework, tailored to underwater vehicles, that faces exploration combining mapping, active localization, and view planning in a unified way. We propose an exploration framework, based on an active SLAM strategy, that combines three main elements: a view planner, an iterative closest point algorithm (ICP)-based pose-graph SLAM algorithm, and an action selection mechanism that makes use of the joint map and state entropy reduction. To demonstrate the benefits of the active SLAM strategy, several tests were conducted with the Girona 500 AUV, both in simulation and in the real world. The article shows how the proposed framework makes it possible to plan exploratory trajectories that keep the vehicle’s uncertainty bounded; thus, creating more consistent maps.Peer ReviewedPostprint (published version

    Toward automated earned value tracking using 3D imaging tools

    Get PDF

    Distributed top-k aggregation queries at large

    Get PDF
    Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network
    corecore