    Accessing multiversion data in database transactions

    Many important database applications need to access previous versions of the data set, thus requiring that the data are stored in a multiversion database and indexed with a multiversion index, such as the multiversion B+-tree (MVBT) of Becker et al. The MVBT is optimal, so that any version of the database can be accessed as efficiently as with a single-version B+-tree that is used to index only the data items of that version, but it cannot be used in a full-fledged database system because it follows a single-update model, and the update cannot be rolled back. We have redesigned the MVBT index so that a single multi-action updating transaction can operate on the index structure concurrently with multiple read-only transactions. Data items created by the transaction become part of the same version, and the transaction can roll back. We call this structure the transactional MVBT (TMVBT). The TMVBT index remains optimal even in the presence of logical key deletions. Even though deletions in a multiversion index must not physically delete the history of the data items, queries and range scans can become more efficient if the leaf pages of the index structure are merged to retain optimality. For the general transactional setting with multiple updating transactions, we propose a multiversion database structure called the concurrent MVBT (CMVBT), which stores the updates of active transactions in a separate main-memory-resident versioned B+-tree index. A system maintenance transaction is periodically run to apply the updates of committed transactions into the TMVBT index. We show how multiple updating transactions can operate on the CMVBT index concurrently, and our recovery algorithm is based on the standard ARIES recovery algorithm. We prove that the TMVBT index is asymptotically optimal, and show that the performance of the CMVBT index in general transaction processing is on par with the performance of the time-split B+-tree (TSB-tree) of Lomet and Salzberg. The TSB-tree does not merge leaf pages and is therefore not optimal if logical data-item deletions are allowed. Our experiments show that the CMVBT outperforms the TSB-tree with range queries in the presence of deletions.
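    As a rough illustration of the version semantics described above (not the MVBT, TMVBT, or CMVBT structures themselves), the following Python sketch stores each data item with a half-open version interval, so that a logical deletion only closes the interval and earlier versions stay queryable. The class `ToyMultiversionStore`, its method names, and the linear scan are assumptions made for illustration; a real index would use a paged tree with logging and concurrency control.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    key: int
    value: str
    start: int              # first version in which the entry is alive
    end: float = math.inf   # first version in which it is no longer alive


class ToyMultiversionStore:
    """Hypothetical toy store: one updating transaction at a time."""

    def __init__(self):
        self.entries = []
        self.committed = 0  # latest committed version

    def begin(self):
        # All updates of the transaction share this single version number.
        return self.committed + 1

    def insert(self, version, key, value):
        self.entries.append(Entry(key, value, version))

    def delete(self, version, key):
        # Logical deletion: close the life span, keep the history.
        for e in self.entries:
            if e.key == key and e.end == math.inf:
                e.end = version

    def commit(self, version):
        self.committed = version

    def rollback(self, version):
        # Drop entries the transaction created, revive entries it deleted.
        self.entries = [e for e in self.entries if e.start != version]
        for e in self.entries:
            if e.end == version:
                e.end = math.inf

    def range_query(self, version, lo, hi):
        # Return the items alive in `version` with lo <= key <= hi.
        return sorted(
            (e.key, e.value)
            for e in self.entries
            if e.start <= version < e.end and lo <= e.key <= hi
        )
```

    In this sketch, a range query against an old version still sees items that a later transaction deleted, and a rolled-back transaction leaves no trace in any version.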

    Efficient bulk-loading methods for temporal and multidimensional index structures

    Nearly all fields of the natural sciences benefit from the latest methods for analyzing and processing large volumes of data. These methods require efficient processing of spatial and temporal data, since time and position are important attributes of many data sets. Efficient query processing is made possible in particular by the use of index structures. This thesis focuses on two index structures: the multiversion B-tree (MVBT) and the R-tree. The first is used to manage temporal data, the second to index multidimensional rectangle data. Constantly and rapidly growing data volumes pose a major challenge for computer science. Building and updating indexes with conventional methods (record by record) is no longer efficient. To enable timely and cost-efficient data processing, methods for fast loading of index structures are urgently needed. In the first part of the thesis, we address the question of whether there is a method for loading an MVBT that has the same I/O complexity as external sorting. Until now, this question had remained unanswered. In this thesis, we developed a new construction method and showed that it has the same time complexity as external sorting. To do so, we employed two algorithmic techniques: weight balancing and buffer trees. Our experiments show that the result is not only of theoretical interest. In the second part of the thesis, we investigate whether and how statistical information about spatial queries can be exploited to improve the query performance of R-trees. Our new method uses information such as the aspect ratio and side lengths of a representative query rectangle to build a good R-tree with respect to a commonly used cost model. If this information is not available, we optimize R-trees with respect to the sum of the volumes of the minimum bounding rectangles of the leaf nodes. Since building optimal R-trees with respect to this cost measure is NP-hard, we first reduce the problem to a one-dimensional partitioning problem by sorting the data along optimized space-filling curves. We then solve this problem using dynamic programming. The I/O complexity of the method equals that of external sorting, since its I/O running time is dominated by the running time of sorting. In the last part of the thesis, we applied the developed partitioning methods to the construction of spatial histograms, since these, like R-trees, produce a disjoint partitioning of space. Results of extensive experiments show that the new partitioning techniques yield both R-trees with better query performance and spatial histograms with better estimation quality than competing methods.
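    The one-dimensional partitioning step can be sketched roughly as follows. The sketch assumes that the rectangles are already sorted along some space-filling curve and that each leaf must hold between `min_size` and `max_size` rectangles; it then chooses contiguous groups that minimize the sum of the areas of their minimum bounding rectangles by dynamic programming. The function names, the two-dimensional `(xlo, ylo, xhi, yhi)` tuples, and the in-memory recomputation of bounding rectangles are illustrative assumptions, not the thesis's actual algorithm.

```python
import math


def mbr_area(rects):
    """Area of the minimum bounding rectangle of (xlo, ylo, xhi, yhi) tuples."""
    xlo = min(r[0] for r in rects)
    ylo = min(r[1] for r in rects)
    xhi = max(r[2] for r in rects)
    yhi = max(r[3] for r in rects)
    return (xhi - xlo) * (yhi - ylo)


def partition_sorted(rects, min_size, max_size):
    """Split an already sorted sequence into contiguous groups.

    Returns (total_area, groups), where total_area is the minimal sum of
    group MBR areas under the given group-size bounds.
    """
    n = len(rects)
    best = [math.inf] * (n + 1)  # best[i]: minimal cost for the first i items
    cut = [-1] * (n + 1)         # cut[i]: start index of the last group
    best[0] = 0.0
    for i in range(1, n + 1):
        for size in range(min_size, min(max_size, i) + 1):
            j = i - size
            if best[j] == math.inf:
                continue  # the prefix of length j has no feasible partition
            cost = best[j] + mbr_area(rects[j:i])
            if cost < best[i]:
                best[i], cut[i] = cost, j
    if best[n] == math.inf:
        raise ValueError("no feasible partition for these size bounds")
    # Walk the cut points backwards to recover the groups.
    groups = []
    i = n
    while i > 0:
        groups.append(rects[cut[i]:i])
        i = cut[i]
    groups.reverse()
    return best[n], groups
```

    Recomputing each bounding rectangle from scratch keeps the sketch short; an implementation concerned with running time would extend the bounding rectangle incrementally as the candidate group grows.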

    Advance of the Access Methods

    The goal of this paper is to outline the advances in access methods over the last ten years and to review all the methods found in the accessible bibliography.

    A Strategy for Reducing I/O and Improving Query Processing Time in an Oracle Data Warehouse Environment

    In the current information age, as the saying goes, time is money. For the modern information worker, decisions must often be made quickly. Every extra minute spent waiting for critical data could mean the difference between financial gain and financial ruin. Despite the importance of timely data retrieval, many organizations lack even a basic strategy for improving the performance of their data-warehouse-based reporting systems. This project explores the idea that a strategy making use of three database performance improvement techniques can reduce I/O (input/output operations) and improve query processing time in an information system designed for reporting. To demonstrate that these performance improvement goals can be achieved, queries were run on ordinary tables and then on tables utilizing the performance improvement techniques. The I/O statistics and processing times for the queries were compared to measure the amount of performance improvement. The measurements were also used to explain how these techniques may be more or less effective under certain circumstances, such as when a particular type of query is run. The collected I/O and time-based measurements showed a varying degree of improvement for each technique, depending on the query used. Matching the performance improvement technique being implemented to the types of queries commonly run on the system was found to be an important consideration. The results indicated that in a reporting environment these performance improvement techniques have the potential to reduce I/O and improve query performance.

    10381 Summary and Abstracts Collection -- Robust Query Processing

    Dagstuhl seminar 10381 on robust query processing (held 19.09.10 - 24.09.10) brought together a diverse set of researchers and practitioners with a broad range of expertise for the purpose of fostering discussion and collaboration regarding causes, opportunities, and solutions for achieving robust query processing. The seminar strove to build a unified view across the loosely coupled system components responsible for the various stages of database query processing. Participants were chosen for their experience with database query processing and, where possible, their prior work in academic research or in product development towards robustness in database query processing. In order to pave the way to motivate, measure, and protect future advances in robust query processing, seminar 10381 focused on developing tests for measuring the robustness of query processing. In these proceedings, we first review the seminar topics, goals, and results, then present abstracts or notes of some of the seminar break-out sessions. We also include, as an appendix, the robust query processing reading list that was collected and distributed to participants before the seminar began, as well as summaries of a few of those papers that were contributed by some participants.

    Design of efficient and elastic storage in the cloud

    Final Report: Efficient Databases for MPC Microdata

    Design and performance evaluation of indexing methods for dynamic attributes in mobile database management systems

    Ankara: Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University, 1997. Thesis (Master's) -- Bilkent University, 1997. Includes bibliographical references (leaves 99-104). Tayeb, Jamel. M.S.