
    Transactional and analytical data management on persistent memory

    The increasing number of smart devices and sensors, as well as social media, is causing the volume of data, and thus the demanded processing speed, to grow steadily. At the same time, many applications need to store data persistently or even comply with strict transactional guarantees. The novel storage technology Persistent Memory (PMem), with its unique properties, seems to be a natural candidate to meet these requirements efficiently. Compared to DRAM, it is more scalable, less expensive, and durable. In contrast to disks, it is significantly faster and directly addressable. Therefore, this dissertation investigates the deliberate employment of PMem to fit the needs of modern applications. After presenting the fundamentals of how PMem works and how to work with it, we focus primarily on three aspects of data management. First, we disassemble several persistent data and index structures into their underlying design primitives to reveal the trade-offs for various access patterns. This allows us to identify their best use cases and vulnerabilities, but also to gain general insights into the design of PMem-based data structures.
    Second, we propose two storage layouts that target analytical workloads and enable efficient query execution on arbitrary attributes. While the first approach employs a linked list of multi-dimensional clustered blocks that potentially span several storage layers, the second approach is a multi-dimensional index that caches nodes in DRAM. Third, we show how to improve stream and event processing systems involving transactional state management using the preceding data structures and insights. In this context, we propose a novel Transactional Stream Processing (TSP) model with appropriate consistency and concurrency protocols adapted to PMem. Together, the discussed aspects are intended to provide a foundation for developing even more sophisticated PMem-enabled systems. At the same time, they show how data management tasks can take advantage of PMem by opening up new application domains; improving performance, scalability, and recovery guarantees; simplifying code complexity; and reducing economic and environmental costs.
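
    As an illustration (the abstract above contains no code): the durable-write pattern that PMem-resident data structures depend on can be sketched in C with the libpmem API from the Persistent Memory Development Kit (PMDK); the file path and record contents below are invented for the example (compile with -lpmem).

        #include <libpmem.h>
        #include <stdio.h>
        #include <string.h>

        /* Sketch: durably store a small record in a PMem-backed file.
         * Path and record layout are illustrative, not from the thesis. */
        int main(void)
        {
            size_t mapped_len;
            int is_pmem;
            char *addr = pmem_map_file("/mnt/pmem/demo", 4096,
                                       PMEM_FILE_CREATE, 0666,
                                       &mapped_len, &is_pmem);
            if (addr == NULL) {
                perror("pmem_map_file");
                return 1;
            }
            const char record[] = "state: committed";
            memcpy(addr, record, sizeof(record));
            /* Make the write durable before relying on it: on real PMem
             * this flushes the affected cache lines and fences; otherwise
             * it falls back to msync. */
            if (is_pmem)
                pmem_persist(addr, sizeof(record));
            else
                pmem_msync(addr, sizeof(record));
            pmem_unmap(addr, mapped_len);
            return 0;
        }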

    An Efficient Way to Allocate and Read Directory Entries in the Ext4 File System

    The aim of this thesis is to improve the performance of sequential directory traversal in the ext4 file system. The HTree data structure that is currently used to store directories in ext4 works very well for random accesses; however, it is not optimal when it comes to traversing a directory sequentially. This thesis investigates the issue: it explores the implementation of ext4 and the associated Linux kernel subsystems. To assess the performance of the current directory index, a set of test cases and benchmarks was implemented. Based on the analysis, an optimization was designed and implemented in the ext4 driver within the Linux kernel. The implementation was tested, evaluated, and compared to other native Linux file systems in the last chapter of this document.
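
    For background (this is the standard POSIX pattern, not the thesis's kernel changes), sequential directory traversal boils down to the readdir loop below; the order in which entries appear is whatever the file system's directory index, here ext4's HTree, returns, and that traversal is exactly what the thesis optimizes.

        #include <dirent.h>
        #include <stdio.h>

        /* Standard sequential directory scan; entry order is determined
         * by the file system's directory index (ext4: HTree). */
        int main(int argc, char **argv)
        {
            const char *path = argc > 1 ? argv[1] : ".";
            DIR *dir = opendir(path);
            if (dir == NULL) {
                perror("opendir");
                return 1;
            }
            struct dirent *ent;
            while ((ent = readdir(dir)) != NULL)
                printf("%lu\t%s\n", (unsigned long)ent->d_ino, ent->d_name);
            closedir(dir);
            return 0;
        }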

    Architectural Principles for Database Systems on Storage-Class Memory

    Database systems have long been optimized to hide the higher latency of storage media, yielding complex persistence mechanisms. With the advent of large DRAM capacities, it became possible to keep a full copy of the data in DRAM. Systems that leverage this possibility, such as main-memory databases, keep two copies of the data in two different formats: one in main memory and the other one in storage. The two copies are kept synchronized using snapshotting and logging. This main-memory-centric architecture yields nearly two orders of magnitude faster analytical processing than traditional, disk-centric ones. The rise of Big Data emphasized the importance of such systems with an ever-increasing need for more main memory. However, DRAM is hitting its scalability limits: It is intrinsically hard to further increase its density. Storage-Class Memory (SCM) is a group of novel memory technologies that promise to alleviate DRAM’s scalability limits. They combine the non-volatility, density, and economic characteristics of storage media with the byte-addressability and a latency close to that of DRAM. Therefore, SCM can serve as persistent main memory, thereby bridging the gap between main memory and storage. In this dissertation, we explore the impact of SCM as persistent main memory on database systems. Assuming a hybrid SCM-DRAM hardware architecture, we propose a novel software architecture for database systems that places primary data in SCM and directly operates on it, eliminating the need for explicit IO. This architecture yields many benefits: First, it obviates the need to reload data from storage to main memory during recovery, as data is discovered and accessed directly in SCM. Second, it allows replacing the traditional logging infrastructure by fine-grained, cheap micro-logging at data-structure level. Third, secondary data can be stored in DRAM and reconstructed during recovery. Fourth, system runtime information can be stored in SCM to improve recovery time. Finally, the system may retain and continue in-flight transactions in case of system failures. However, SCM is no panacea as it raises unprecedented programming challenges. Given its byte-addressability and low latency, processors can access, read, modify, and persist data in SCM using load/store instructions at a CPU cache line granularity. The path from CPU registers to SCM is long and mostly volatile, including store buffers and CPU caches, leaving the programmer with little control over when data is persisted. Therefore, there is a need to enforce the order and durability of SCM writes using persistence primitives, such as cache line flushing instructions. This in turn creates new failure scenarios, such as missing or misplaced persistence primitives. We devise several building blocks to overcome these challenges. First, we identify the programming challenges of SCM and present a sound programming model that solves them. Then, we tackle memory management, as the first required building block to build a database system, by designing a highly scalable SCM allocator, named PAllocator, that fulfills the versatile needs of database systems. Thereafter, we propose the FPTree, a highly scalable hybrid SCM-DRAM persistent B+-Tree that bridges the gap between the performance of transient and persistent B+-Trees. Using these building blocks, we realize our envisioned database architecture in SOFORT, a hybrid SCM-DRAM columnar transactional engine. 
We propose an SCM-optimized MVCC scheme that eliminates write-ahead logging from the critical path of transactions. Since SCM-resident data is near-instantly available upon recovery, the new recovery bottleneck is rebuilding DRAM-based data. To alleviate this bottleneck, we propose a novel recovery technique that achieves nearly instant responsiveness of the database by accepting queries right after recovering SCM-based data, while rebuilding DRAM-based data in the background. Additionally, SCM brings new failure scenarios that existing testing tools cannot detect. Hence, we propose an online testing framework that is able to automatically simulate power failures and detect missing or misplaced persistence primitives. Finally, our proposed building blocks can serve to build more complex systems, paving the way for future database systems on SCM.
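
    To make the persistence primitives mentioned above concrete (an illustrative sketch, not SOFORT code): on x86, forcing a range of SCM-resident bytes to durability amounts to flushing the cache lines that cover it and then fencing. The sketch assumes 64-byte cache lines and uses the baseline CLFLUSH instruction, where production code would prefer the cheaper CLWB.

        #include <emmintrin.h>  /* _mm_clflush, _mm_sfence */
        #include <stddef.h>
        #include <stdint.h>

        /* Flush every cache line covering [addr, addr+len), then fence
         * so later stores are ordered after the flushed ones. Omitting
         * such a call is exactly the kind of "missing persistence
         * primitive" bug the proposed testing framework targets. */
        static void persist(const void *addr, size_t len)
        {
            const uintptr_t line = 64;  /* assumed cache line size */
            uintptr_t p   = (uintptr_t)addr & ~(line - 1);
            uintptr_t end = (uintptr_t)addr + len;
            for (; p < end; p += line)
                _mm_clflush((const void *)p);
            _mm_sfence();
        }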

    Algorithms and Data Structures for Automated Change Detection and Classification of Sidescan Sonar Imagery

    During Mine Warfare (MIW) operations, MIW analysts perform change detection by visually comparing historical sidescan sonar imagery (SSI) collected by a sidescan sonar with recently collected SSI in an attempt to identify objects (which might be explosive mines) placed at sea since the last time the area was surveyed. This dissertation presents a data structure and three algorithms, developed by the author, that are part of an automated change detection and classification (ACDC) system. MIW analysts at the Naval Oceanographic Office are currently using ACDC to reduce the amount of time needed to perform change detection. The introductory chapter gives background information on change detection and ACDC, and describes how SSI is produced from raw sonar data. Chapter 2 presents the author's Geospatial Bitmap (GB) data structure, which is capable of storing information geographically and is utilized by the three algorithms. This chapter shows that a GB data structure used in a polygon-smoothing algorithm ran 1.3x to 48.4x faster than a sparse matrix data structure. Chapter 3 describes the GB clustering algorithm, which is the author's repeatable, order-independent method for clustering. Results from tests performed in this chapter show that the time to cluster a set of points is not affected by the distribution or the order of the points. In Chapter 4, the author presents his real-time computer-aided detection (CAD) algorithm that automatically detects mine-like objects on the seafloor in SSI. The author ran his GB-based CAD algorithm on real SSI data, and the results of these tests indicate that his real-time CAD algorithm performs comparably to or better than other non-real-time CAD algorithms. The author presents his computer-aided search (CAS) algorithm in Chapter 5. CAS helps MIW analysts locate mine-like features that are geospatially close to previously detected features. A comparison between the CAS and a great circle distance algorithm shows that the CAS performs geospatial searching 1.75x faster on large data sets. Finally, the concluding chapter of this dissertation gives important details on how the completed ACDC system will function, and discusses the author's future research to develop additional algorithms and data structures for ACDC.
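
    The Geospatial Bitmap itself is the author's contribution and its internals are not reproduced here; purely to convey the basic idea of storing information geographically as bits over a grid, here is a naive C sketch (grid resolution and all names are invented, and the real GB is far more sophisticated). Set operations such as comparing two surveys then reduce to bitwise AND/XOR over the arrays, which is one reason bitmap structures suit change detection.

        #include <stdint.h>
        #include <stdlib.h>

        /* Naive geospatial bitmap: one bit per cell of a fixed grid. */
        typedef struct {
            uint8_t *bits;
            int cols, rows;   /* grid resolution */
        } geo_bitmap;

        static geo_bitmap *gb_create(int cols, int rows)
        {
            geo_bitmap *gb = malloc(sizeof *gb);
            gb->cols = cols;
            gb->rows = rows;
            gb->bits = calloc(((size_t)cols * rows + 7) / 8, 1);
            return gb;
        }

        static void gb_set(geo_bitmap *gb, int col, int row)
        {
            size_t i = (size_t)row * gb->cols + col;
            gb->bits[i / 8] |= (uint8_t)(1u << (i % 8));
        }

        static int gb_test(const geo_bitmap *gb, int col, int row)
        {
            size_t i = (size_t)row * gb->cols + col;
            return (gb->bits[i / 8] >> (i % 8)) & 1;
        }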

    OPTIMIZING CLIENT-SERVER COMMUNICATION FOR REMOTE SPATIAL DATABASE ACCESS

    Technological advances in recent years have made it easier to create spatial data. Every day, vast amounts of data are collected by both governmental institutions (e.g., USGS, NASA) and commercial entities (e.g., IKONOS). This process is driven by increased popularity and affordability across the whole spectrum of collection methods, ranging from personal GPS units to satellite systems. Many collection methods, such as satellite systems, produce data in raster format. Often, such raster data is analyzed by researchers directly, while at other times it is used to produce the final dataset in vector format. With the rapidly increasing supply of data, more applications for this data are being developed that are of interest to a wider consumer base. The increasing popularity of spatial data viewers and query tools among end users introduces a requirement for methods that allow these basic users to access this data for viewing and querying instantly and without much effort. In our work, we focus on providing remote access to vector-based spatial data rather than raster data. We explore new ways of visualizing both spatial and non-spatial data stored in a central server database on a simple client connected to this server by a possibly slow and unreliable connection. We consider usage scenarios where transferring the whole database for processing on the client is not feasible, due to the large volume of data stored on the server, a lack of computing power on the client, and a slow link between the two. We focus on finding an optimal way of distributing work between the server, clients, and possibly other entities introduced into the model for query evaluation and data management. We address issues of scalability for clients that have only limited access to system resources (e.g., a Java applet). Methods that allow these clients to provide an interactive user interface, even for databases of arbitrary size, are also examined.

    Design, Implementation, and Evaluation of IPv4/IPv6 Longest Prefix Match Support in Multi-Architecture Programmable Data Planes

    Advisor: Christian Rodolfo Esteve Rothenberg. Master's dissertation, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: New trends in dataplane programmability inside Software Defined Networking (SDN) aim to bring multi-platform support with a precise definition of the information that is processed by the dataplane pipeline. However, some challenges are still present, such as the need for a programmable dataplane and for a protocol-independent programming abstraction. The Programming Protocol-Independent Packet Processors (P4) Domain Specific Language (DSL) is an emerging way to express how packets are processed by the dataplane of a programmable network platform. In parallel, the OpenDataPlane (ODP) project provides an open-source, cross-platform set of Application Programming Interfaces (APIs) designed for the networking data plane.
    The Multi-Architecture Compiler System for Abstract Dataplanes (MACSAD) is an approach to converge P4 and ODP in a conventional compilation process, achieving portability of dataplane applications without affecting the target's performance improvements. MACSAD can integrate the ODP API and P4, bringing them together and defining a programmable dataplane across multiple targets in a unified compiler system. This work aims at adding IPv4/IPv6 Longest Prefix Match (LPM) support to MACSAD, integrated with the ODP APIs and P4 programmability, delivering high-performance dataplane capabilities. The proposed LPM support for MACSAD combines the lookup algorithm and the ODP API library with MACSAD's table support to create a complete forwarding base used in the LPM process. The IPv4 implementation adapts the current ODP lookup algorithm to work with MACSAD. The IPv6 lookup implementation, currently not supported by ODP, is an extension of the IPv4 support, developed using the same algorithm adapted to a 128-bit key. Both the IPv4 and IPv6 lookups use a binary tree base to perform the LPM lookup. For the performance evaluation of the LPM support, we use the Network Function Performance Analyzer (NFPA), a traffic generator tool that allows generating different types of traffic across MACSAD. As a side contribution on this front, we developed and released as open source the BB-Gen packet crafter tool. Experimental results show that it is possible to reach a throughput of 10G with packet sizes of 512 bytes and above.
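
    The "binary tree base" used for the LPM lookup is, in essence, a unibit trie walked one address bit at a time while remembering the deepest prefix seen; the C sketch below illustrates the technique for IPv4 (this is not MACSAD/ODP code, and the node layout is invented; the IPv6 variant is the same walk over a 128-bit key).

        #include <stdint.h>
        #include <stdlib.h>

        /* Minimal unibit trie for IPv4 longest-prefix match. */
        typedef struct node {
            struct node *child[2];
            int next_hop;           /* -1 if no prefix ends here */
        } node;

        static node *new_node(void)
        {
            node *n = calloc(1, sizeof *n);
            n->next_hop = -1;
            return n;
        }

        static void insert(node *root, uint32_t prefix, int len, int hop)
        {
            for (int i = 0; i < len; i++) {
                int bit = (prefix >> (31 - i)) & 1;
                if (root->child[bit] == NULL)
                    root->child[bit] = new_node();
                root = root->child[bit];
            }
            root->next_hop = hop;
        }

        static int lookup(const node *root, uint32_t addr)
        {
            int best = -1;
            for (int i = 0; i < 32 && root != NULL; i++) {
                if (root->next_hop >= 0)
                    best = root->next_hop;  /* longest match so far */
                root = root->child[(addr >> (31 - i)) & 1];
            }
            if (root != NULL && root->next_hop >= 0)
                best = root->next_hop;      /* exact /32 match */
            return best;
        }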

    A+ Indexes: Highly Flexible Adjacency Lists in Graph Database Management Systems

    Adjacency lists are the most fundamental storage structure in existing graph database management systems (GDBMSs) for indexing input graphs. Adjacency lists are universally linked-list-like per-vertex structures that allow access to the set of edges adjacent to a vertex. In several systems, adjacency lists can also allow efficient access to subsets of a vertex's adjacent edges that satisfy a fixed set of predicates, such as those that have the same label, and support a fixed set of ordering criteria, such as sorting by the IDs of the destination vertices of the edges. This thesis describes a highly flexible indexing subsystem for GDBMSs, which consists of two components. The primary component, called A+ indexes, stores adjacency lists that, compared to existing adjacency lists, give users flexibility in three aspects: (1) in addition to per-vertex adjacency lists, users can define per-edge adjacency lists; (2) users can define adjacency lists for sets of edges that satisfy a wide range of predicates; and (3) users can specify flexible sorting criteria. Indexes in existing GDBMSs, such as adjacency list, B+ tree, or hash indexes, index the vertices or edges of the input graph as elements. The second component of our indexing subsystem consists of secondary B+ tree and bitmap indexes that index aggregate properties of the adjacency lists in A+ indexes; our secondary indexes therefore effectively index adjacency lists as elements. We have implemented our indexing subsystem on top of the Graphflow GDBMS. We describe our indexes, the modifications we made to Graphflow's optimizer, and our implementation. We provide extensive experiments demonstrating both the flexibility and efficiency of our indexes on a large suite of queries from several application domains.
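
    As an illustration of the baseline structure that A+ indexes generalize (a sketch with invented types, not Graphflow code): a per-vertex adjacency list partitioned by edge label and sorted by destination vertex ID makes "neighbours of v via label l" a single contiguous, ordered slice. Keeping each partition sorted also enables merge-style intersections during joins, which is the kind of fixed ordering criterion the thesis makes user-configurable.

        #include <stdint.h>

        /* Illustrative per-vertex adjacency list, partitioned by edge
         * label and sorted by destination ID within each partition.
         * offsets has num_labels + 1 entries; edges with label l live
         * in dst[offsets[l] .. offsets[l+1]). */
        typedef struct {
            uint32_t *dst;        /* sorted destination vertex IDs */
            uint32_t *offsets;    /* per-label partition boundaries */
            uint32_t num_labels;
        } adj_list;

        /* Neighbours of this vertex reachable via `label`, as a slice. */
        static void neighbours(const adj_list *a, uint32_t label,
                               const uint32_t **begin, const uint32_t **end)
        {
            *begin = a->dst + a->offsets[label];
            *end   = a->dst + a->offsets[label + 1];
        }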