1,850 research outputs found

    Incremental file reorganization schemes

    Get PDF
    Issued as Final project report, Project no. G-36-66

    Improving I/O Performance for Exascale Applications through Online Data Layout Reorganization

    Get PDF
    The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent Exascale computers will generate scientific results with unprecedented fidelity and record turn-around time. Many of these codes are based on particle-mesh methods and use advanced algorithms, especially dynamic load-balancing and mesh-refinement, to achieve high performance on Exascale machines. Yet, as such algorithms improve parallel application efficiency, they raise new challenges for I/O logic due to their irregular and dynamic data distributions. Thus, while the enormous data rates of Exascale simulations already challenge existing file system write strategies, the need for efficient read and processing of generated data introduces additional constraints on the data layout strategies that can be used when writing data to secondary storage. We review these I/O challenges and introduce two online data layout reorganization approaches for achieving good tradeoffs between read and write performance. We demonstrate the benefits of using these two approaches for the ECP particle-in-cell simulation WarpX, which serves as a motif for a large class of important Exascale applications. We show that by understanding application I/O patterns and carefully designing data layouts we can increase read performance by more than 80 percent

    Matching non-uniformity for program optimizations on heterogeneous many-core systems

    Get PDF
    As computing enters an era of heterogeneity and massive parallelism, it exhibits a distinct feature: the deepening non-uniform relations among the computing elements in both hardware and software. Besides traditional non-uniform memory accesses, much deeper non-uniformity shows in a processor, runtime, and application, exemplified by the asymmetric cache sharing, memory coalescing, and thread divergences on multicore and many-core processors. Being oblivious to the non-uniformity, current applications fail to tap into the full potential of modern computing devices.;My research presents a systematic exploration into the emerging property. It examines the existence of such a property in modern computing, its influence on computing efficiency, and the challenges for establishing a non-uniformity--aware paradigm. I propose several techniques to translate the property into efficiency, including data reorganization to eliminate non-coalesced accesses, asynchronous data transformations for locality enhancement and a controllable scheduling for exploiting non-uniformity among thread blocks. The experiments show much promise of these techniques in maximizing computing throughput, especially for programs with complex data access patterns

    The Ubiquitous B-tree: Volume II

    Get PDF
    Major developments relating to the B-tree from early 1979 through the fall of 1986 are presented. This updates the well-known article, The Ubiquitous B-Tree by Douglas Comer (Computing Surveys, June 1979). After a basic overview of B and B+ trees, recent research is cited as well as descriptions of nine B-tree variants developed since Comer\u27s article. The advantages and disadvantages of each variant over the basic B-tree are emphasized. Also included are a discussion of concurrency control issues in B-trees and a speculation on the future of B-trees

    Efficient bulk-loading methods for temporal and multidimensional index structures

    Get PDF
    Nahezu alle naturwissenschaftlichen Bereiche profitieren von neuesten Analyse- und Verarbeitungsmethoden für große Datenmengen. Diese Verfahren setzten eine effiziente Verarbeitung von geo- und zeitbezogenen Daten voraus, da die Zeit und die Position wichtige Attribute vieler Daten sind. Die effiziente Anfrageverarbeitung wird insbesondere durch den Einsatz von Indexstrukturen ermöglicht. Im Fokus dieser Arbeit liegen zwei Indexstrukturen: Multiversion B-Baum (MVBT) und R-Baum. Die erste Struktur wird für die Verwaltung von zeitbehafteten Daten, die zweite für die Indexierung von mehrdimensionalen Rechteckdaten eingesetzt. Ständig- und schnellwachsendes Datenvolumen stellt eine große Herausforderung an die Informatik dar. Der Aufbau und das Aktualisieren von Indexen mit herkömmlichen Methoden (Datensatz für Datensatz) ist nicht mehr effizient. Um zeitnahe und kosteneffiziente Datenverarbeitung zu ermöglichen, werden Verfahren zum schnellen Laden von Indexstrukturen dringend benötigt. Im ersten Teil der Arbeit widmen wir uns der Frage, ob es ein Verfahren für das Laden von MVBT existiert, das die gleiche I/O-Komplexität wie das externe Sortieren besitz. Bis jetzt blieb diese Frage unbeantwortet. In dieser Arbeit haben wir eine neue Kostruktionsmethode entwickelt und haben gezeigt, dass diese gleiche Zeitkomplexität wie das externe Sortieren besitzt. Dabei haben wir zwei algorithmische Techniken eingesetzt: Gewichts-Balancierung und Puffer-Bäume. Unsere Experimenten zeigen, dass das Resultat nicht nur theoretischer Bedeutung ist. Im zweiten Teil der Arbeit beschäftigen wir uns mit der Frage, ob und wie statistische Informationen über Geo-Anfragen ausgenutzt werden können, um die Anfrageperformanz von R-Bäumen zu verbessern. Unsere neue Methode verwendet Informationen wie Seitenverhältnis und Seitenlängen eines repräsentativen Anfragerechtecks, um einen guten R-Baum bezüglich eines häufig eingesetzten Kostenmodells aufzubauen. Falls diese Informationen nicht verfügbar sind, optimieren wir R-Bäume bezüglich der Summe der Volumina von minimal umgebenden Rechtecken der Blattknoten. Da das Problem des Aufbaus von optimalen R-Bäumen bezüglich dieses Kostenmaßes NP-hart ist, führen wir zunächst das Problem auf ein eindimensionales Partitionierungsproblem zurück, indem wir die Daten bezüglich optimierte raumfüllende Kurven sortieren. Dann lösen wir dieses Problem durch Einsatz vom dynamischen Programmieren. Die I/O-Komplexität des Verfahrens ist gleich der von externem Sortieren, da die I/O-Laufzeit der Methode durch die Laufzeit des Sortierens dominiert wird. Im letzten Teil der Arbeit haben wir die entwickelten Partitionierungsvefahren für den Aufbau von Geo-Histogrammen eingesetzt, da diese ähnlich zu R-Bäumen eine disjunkte Partitionierung des Raums erzeugen. Ergebnisse von intensiven Experimenten zeigen, dass sich unter Verwendung von neuen Partitionierungstechniken sowohl R-Bäume mit besserer Anfrageperformanz als auch Geo-Histogrammen mit besserer Schätzqualität im Vergleich zu Konkurrenzverfahren generieren lassen

    Graph Processing in Main-Memory Column Stores

    Get PDF
    Evermore, novel and traditional business applications leverage the advantages of a graph data model, such as the offered schema flexibility and an explicit representation of relationships between entities. As a consequence, companies are confronted with the challenge of storing, manipulating, and querying terabytes of graph data for enterprise-critical applications. Although these business applications operate on graph-structured data, they still require direct access to the relational data and typically rely on an RDBMS to keep a single source of truth and access. Existing solutions performing graph operations on business-critical data either use a combination of SQL and application logic or employ a graph data management system. For the first approach, relying solely on SQL results in poor execution performance caused by the functional mismatch between typical graph operations and the relational algebra. To the worse, graph algorithms expose a tremendous variety in structure and functionality caused by their often domain-specific implementations and therefore can be hardly integrated into a database management system other than with custom coding. Since the majority of these enterprise-critical applications exclusively run on relational DBMSs, employing a specialized system for storing and processing graph data is typically not sensible. Besides the maintenance overhead for keeping the systems in sync, combining graph and relational operations is hard to realize as it requires data transfer across system boundaries. A basic ingredient of graph queries and algorithms are traversal operations and are a fundamental component of any database management system that aims at storing, manipulating, and querying graph data. Well-established graph traversal algorithms are standalone implementations relying on optimized data structures. The integration of graph traversals as an operator into a database management system requires a tight integration into the existing database environment and a development of new components, such as a graph topology-aware optimizer and accompanying graph statistics, graph-specific secondary index structures to speedup traversals, and an accompanying graph query language. In this thesis, we introduce and describe GRAPHITE, a hybrid graph-relational data management system. GRAPHITE is a performance-oriented graph data management system as part of an RDBMS allowing to seamlessly combine processing of graph data with relational data in the same system. We propose a columnar storage representation for graph data to leverage the already existing and mature data management and query processing infrastructure of relational database management systems. At the core of GRAPHITE we propose an execution engine solely based on set operations and graph traversals. Our design is driven by the observation that different graph topologies expose different algorithmic requirements to the design of a graph traversal operator. We derive two graph traversal implementations targeting the most common graph topologies and demonstrate how graph-specific statistics can be leveraged to select the optimal physical traversal operator. To accelerate graph traversals, we devise a set of graph-specific, updateable secondary index structures to improve the performance of vertex neighborhood expansion. Finally, we introduce a domain-specific language with an intuitive programming model to extend graph traversals with custom application logic at runtime. We use the LLVM compiler framework to generate efficient code that tightly integrates the user-specified application logic with our highly optimized built-in graph traversal operators. Our experimental evaluation shows that GRAPHITE can outperform native graph management systems by several orders of magnitude while providing all the features of an RDBMS, such as transaction support, backup and recovery, security and user management, effectively providing a promising alternative to specialized graph management systems that lack many of these features and require expensive data replication and maintenance processes
    • …
    corecore