    Indexer++: Workload-Aware Online Index Tuning With Transformers and Reinforcement Learning

    With the increasing workload complexity in modern databases, manual index selection is a challenging task. There is a growing need for a database that can learn and adapt to evolving workloads. This paper proposes Indexer++, an autonomous, workload-aware, online index tuner. Unlike existing approaches, Indexer++ imposes low overhead on the DBMS, is responsive to changes in query workloads, and swiftly selects indexes. Our approach uses a combination of text analytic techniques and reinforcement learning. Indexer++ consists of two phases: Phase (i) learns workload trends using a novel trend detection technique based on a pre-trained transformer model. Phase (ii) performs online index selection, i.e., selection that runs continuously while the DBMS is processing workloads, using a novel online deep reinforcement learning technique that relies on our proposed priority experience sweeping. This paper provides an experimental evaluation of Indexer++ in multiple scenarios using a benchmark (TPC-H) and a real-world dataset (IMDB). In our experiments, Indexer++ effectively identifies changes in workload trends and selects the set of optimal indexes.

    Achieving a Sequenced, Relational Query Language with Log-Segmented Timestamps

    On the Semantics of "Now" in Databases

    While "now" is expressed in SQL as CURRENT_TIMESTAMP within queries, this value cannot be stored in the database. However, this notion of an ever-increasing current-time value has been reflected in some temporal data models by inclusion of database-resident variables, such as "now," "until-changed," "∞," "@" and "-." Time variables are very desirable, but their use also leads to a new type of database, consisting of tuples with variables, termed a variable database. This paper proposes a framework for defining the semantics of the variable databases of temporal relational data models. A framework is presented because several reasonable meanings may be given to databases that use some of the specific temporal variables that have appeared in the literature. Using the framework, the paper defines a useful semantics for such databases. Because situations occur where the existing time variables are inadequate, two new types of modeling entities that address these shortcomings, timestamps which we call now-relative and now-relative indeterminate, are introduced and defined within the framework. Moreover, the paper provides a foundation, using algebraic bind operators, for the querying of variable databases via existing query languages. The transition to variable databases presented here requires minimal change to the query processor. Finally, to underline the practical feasibility of variable databases, we show that database variables can be precisely specified and efficiently implemented in conventional query languages, such as SQL, and in temporal query languages, such as TSQL2. Information Systems Working Papers Series.

    Content-based Navigation in a Mini-World Web

    Several database query languages have recently been developed to locate and retrieve documents in the vast network of World-Wide Web pages. These languages combine path expressions, which specify the structure of a path through the network to the desired information, with content predicates, which force the path to pass through pages with particular content. The straightforward implementation of these languages is based on breadth-first search of the network, with heavy reliance placed on the user's understanding of network topology to both direct and constrain the search via the appropriate use of the path expressions. In this paper we describe a system that removes the reliance on path expressions to safeguard the search during a query and enables the user to navigate by refining content rather than by specifying structure. Our system uses a cost-constrained model for query evaluation. Links between pages are assigned costs. The user controls how far a query can navigate by specify..

    Automatic Filtering of Now-centric Data

    A now-centric collection of data is characterised by the property that as data in the collection ages, each datum individually becomes less relevant, but remains relevant in aggregate. Such data can be filtered by materialising an aggregate view on the data and then compressing, moving to backup, or deleting the data from which that view was materialised, yielding a smaller collection of data. This paper describes a tool to automatically filter data by building a statistical database from the now-centric collection of data. To build the statistical database, the user supplies a list of filters. Each filter consists of a filter unit and a filter measure. The filter unit specifies a pattern (a regular expression) to match as the now-centric data is filtered. The filter measure is the system of measurement in which occurrences of that pattern are counted. A key feature of the tool is that users may define their own units and measures. Queries on the filtered data are analysed to determine..