1,862 research outputs found

    Integrating the UB-Tree into a Database System Kernel

    Multidimensional access methods have shown high potential for significant performance improvements in various application domains.

    Relační přístup k indexaci (A Relational Approach to Indexing)

    In order to evaluate SQL queries efficiently, database systems provide their users with a set of integrated index access methods. When a new access method is required, one possibility for implementing it in a relational DBMS is to exploit the relational tables of the given database system. This approach does not involve any internal changes to the database system kernel and is therefore available to all developers even when the target DBMS is not distributed as open source. In terms of extensible database architecture, only the ability to extend the existing DBMS with a new data type is required. In this work, a UB-Tree index has been integrated into the Oracle DBMS in this way. The index-related tables have been designed in two different ways, and four alternatives for evaluating the relevant queries have been proposed and studied.
    Finally, several experiments have been done to compare the performance of an access method implemented via the relational approach with a native kernel integration of the same method. Department of Software Engineering, Faculty of Mathematics and Physics
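    The relational integration described above works because the UB-Tree linearizes multidimensional keys onto the Z-order (Morton) curve, so the resulting one-dimensional Z-address can be stored in an ordinary indexed table column. A minimal sketch of the bit-interleaving step (illustrative only; the actual Oracle integration in the thesis is more involved):

    ```python
    # Sketch: compute a Z-order (Morton) address by interleaving the bits
    # of each coordinate. Nearby points get nearby addresses, so a 1-D
    # range scan over the address column approximates a multidimensional
    # range query.
    def z_address(coords, bits=8):
        """Interleave `bits` bits of each coordinate into one Z-address."""
        addr = 0
        dims = len(coords)
        for bit in range(bits):
            for d, c in enumerate(coords):
                addr |= ((c >> bit) & 1) << (bit * dims + d)
        return addr

    # Spatially close points map to close addresses:
    addrs = [z_address(p) for p in [(3, 5), (3, 6), (200, 1)]]
    ```

    In the relational approach, this Z-address column is simply indexed with the DBMS's built-in B-tree, which is what makes the method available without kernel changes.
    
    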

    Dynamic allocation optimization in A/B tests using classification-based preprocessing

    An A/B test evaluates the impact of a new technology by running it in a real production environment and testing its performance on a set of items. Recently, promising new methods have optimized A/B tests with dynamic allocation. They allow a quicker decision about which variation (A or B) is best, saving money for the user. However, dynamic allocation by traditional methods requires certain assumptions, which are not always verified in reality, mainly because the populations tested are not homogeneous. This document reports on a new reinforcement-learning methodology deployed by the commercial A/B testing platform AB Tasty. We provide a new method that not only builds homogeneous groups of users, but also makes it possible to find the best variation for these groups in a short period of time. This paper provides numerical results on AB Tasty's data, as well as on public data sets, to demonstrate an improvement in A/B testing over traditional methods.
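    Dynamic allocation is commonly implemented with a bandit algorithm such as Thompson sampling, which shifts traffic toward the better variation as evidence accumulates. A minimal sketch (Thompson sampling only; the paper's method additionally applies classification-based preprocessing to build homogeneous groups, which is not shown here):

    ```python
    import random

    # Sketch: Thompson sampling for a two-variation A/B test. Each arm's
    # conversion rate gets a Beta posterior; every visitor is routed to
    # the arm whose sampled rate is highest. `conv_probs` are the true
    # (unknown to the algorithm) conversion rates, used only to simulate.
    def thompson_ab(conv_probs, rounds=2000, seed=0):
        rng = random.Random(seed)
        wins = [0] * len(conv_probs)    # observed conversions per arm
        trials = [0] * len(conv_probs)  # observed visitors per arm
        for _ in range(rounds):
            samples = [rng.betavariate(wins[i] + 1, trials[i] - wins[i] + 1)
                       for i in range(len(conv_probs))]
            arm = samples.index(max(samples))
            trials[arm] += 1
            wins[arm] += rng.random() < conv_probs[arm]
        return trials

    # Traffic concentrates on the better variation over time.
    allocation = thompson_ab([0.05, 0.30])
    ```

    The heterogeneity problem the abstract mentions arises because a single posterior per arm averages over dissimilar user populations; pre-segmenting users into homogeneous groups lets each group run its own allocation.
    
    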

    One stone, two birds: A lightweight multidimensional learned index with cardinality support

    Full text link
    Innovative learning-based structures have recently been proposed to tackle indexing and cardinality estimation (CE) tasks, specifically learned indexes and data-driven cardinality estimators. These structures exhibit excellent performance in capturing data distributions, making them promising for integration into AI-driven database kernels. However, accurate estimation for corner-case queries requires a large number of network parameters, resulting in higher computing costs on expensive GPUs and more storage overhead. Additionally, implementing CE and the learned index separately wastes storage by encoding a single table's distribution twice. These issues present challenges for designing AI-driven database kernels: in real database scenarios, a compact kernel is necessary to process queries within a limited storage and time budget. Directly integrating these two AI approaches would yield a heavy and complex kernel due to the large number of network parameters and the repeated storage of data-distribution parameters. Our proposed CardIndex structure effectively kills two birds with one stone. It is a fast multidimensional learned index that also serves as a lightweight cardinality estimator, with parameters scaled at the KB level. Thanks to its special structure and small parameter size, it can obtain both CDF and PDF information for tuples with an extremely low latency of 1 to 10 microseconds. For low-selectivity estimation tasks, we did not increase the model's parameters to obtain fine-grained point density. Instead, we fully exploited our structure's characteristics and proposed a hybrid estimation algorithm to provide fast and exact results.
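    The link between learned indexes and cardinality estimation is that both model the same CDF: predicting a key's position in a sorted array is equivalent to estimating how many tuples fall below it. A minimal one-dimensional sketch of this idea (a linear CDF model with bounded correction; CardIndex's actual multidimensional structure is far more elaborate):

    ```python
    import bisect

    # Sketch: a learned index fits a model of the key -> position CDF,
    # predicts a position, then corrects within a known error bound.
    class LinearLearnedIndex:
        def __init__(self, keys):
            self.keys = keys                      # sorted key array
            n, lo, hi = len(keys), keys[0], keys[-1]
            self.lo = lo
            self.slope = (n - 1) / (hi - lo)      # linear CDF model
            # Worst-case |predicted - true| position, found at build time.
            self.err = max(abs(self._predict(k) - i)
                           for i, k in enumerate(keys))

        def _predict(self, key):
            return round((key - self.lo) * self.slope)

        def lookup(self, key):
            p = self._predict(key)
            left = max(0, p - self.err)
            right = min(len(self.keys), p + self.err + 1)
            i = bisect.bisect_left(self.keys, key, left, right)
            return i if i < len(self.keys) and self.keys[i] == key else -1
    ```

    The same `_predict` function doubles as a cardinality estimator for one-sided range predicates, which is the storage-sharing observation the abstract exploits.
    
    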

    DBEst: revisiting approximate query processing engines with machine learning models

    In the era of big data, computing exact answers to analytical queries becomes prohibitively expensive. This greatly increases the value of approaches that can efficiently compute approximate, but highly accurate, answers to analytical queries. Alas, the state of the art still suffers from many shortcomings: errors are still high unless large memory investments are made; many important analytics tasks are not supported; and query response times are too long, so approaches rely on parallel execution of queries atop large big-data analytics clusters, in situ or in the cloud, whose acquisition and use cost dearly. Hence, the following questions are crucial: can we develop AQP engines that reduce response times by orders of magnitude, ensure high accuracy, and support most aggregate functions, with smaller memory footprints and small overheads to build the state upon which they are based? With this paper, we show that the answers to all of these questions can be positive. The paper presents DBEst, a system based on machine learning models (regression models and probability density estimators). It discusses DBEst's limitations and promises, and how it can complement existing systems. It substantiates DBEst's advantages using queries and data from the TPC-DS benchmark and real-life datasets, compared against state-of-the-art AQP engines.
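    The core idea of model-based AQP is to answer aggregates from a small density model plus a regression model instead of scanning the data. A minimal sketch under simplified assumptions (per-bin summaries stand in for the kernel density estimators and trained regressors that DBEst actually uses):

    ```python
    # Sketch: answer "SELECT COUNT(*), AVG(y) WHERE x BETWEEN lo AND hi"
    # from compact per-bin statistics rather than the raw table.
    def build_models(xs, ys, bins=10):
        lo, hi = min(xs), max(xs)
        width = (hi - lo) / bins
        counts = [0] * bins        # crude density model
        sums = [0.0] * bins        # crude regression model (per-bin sum of y)
        for x, y in zip(xs, ys):
            b = min(int((x - lo) / width), bins - 1)
            counts[b] += 1
            sums[b] += y
        return {"lo": lo, "width": width, "counts": counts, "sums": sums}

    def approx_count_avg(model, qlo, qhi):
        total, total_sum = 0.0, 0.0
        for b, c in enumerate(model["counts"]):
            left = model["lo"] + b * model["width"]
            right = left + model["width"]
            overlap = max(0.0, min(qhi, right) - max(qlo, left))
            frac = overlap / model["width"]   # assume uniform within a bin
            total += c * frac
            total_sum += model["sums"][b] * frac
        return total, (total_sum / total if total else 0.0)
    ```

    The answer quality then depends only on how well the models capture the data distribution, not on the table size, which is why response times can drop by orders of magnitude.
    
    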

    Second CLIPS Conference Proceedings, volume 1

    Topics covered at the 2nd CLIPS Conference, held at the Johnson Space Center, September 23-25, 1991, are given. Topics include rule groupings, fault detection using expert systems, decision making using expert systems, knowledge representation, and computer-aided design and debugging of expert systems.