7 research outputs found

    An NVM Aware MariaDB Database System and Associated IO Workload on File Systems

    Get PDF
    MariaDB is a community-developed fork of the MySQL relational database management system and originally designed and implemented in order to use the traditional spinning disk architecture. With Non-Volatile memory (NVM) technology now in the forefront and main stream for server storage (Data centers), MariaDB addresses the need by adding support for NVM devices and introduces NVM Compression method. NVM Compression is a novel hybrid technique that combines application level compression with flash awareness for optimal performance and storage efficiency. Utilizing new interface primitives exported by Flash Translation Layers (FTLs), we leverage the garbage collection available in flash devices to optimize the capacity management required by compression systems. We implement NVM Compression in the popular MariaDB database and use variants of commonly available POSIX file system interfaces to provide the extended FTL capabilities to the user space application. The experimental results show that the hybrid approach of NVM Compression can improve compression performance by 2-7x, deliver compression performance for flash devices that is within 5% of uncompressed performance, improve storage efficiency by 19% over legacy Row-Compression, reduce data writes by up to 4x when combined with other flash aware techniques such as Atomic Writes, and deliver further advantages in power efficiency and CPU utilization. Various micro benchmark measurement and findings on sparse files call for required improvement in file systems for handling of punch hole operations on files

    Write-Combined Logging: An Optimized Logging for Consistency in NVRAM

    Get PDF
    Nonvolatile memory (e.g., Phase Change Memory) blurs the boundary between memory and storage and it could greatly facilitate the construction of in-memory durable data structures. Data structures can be processed and stored directly in NVRAM. To maintain the consistency of persistent data, logging is a widely adopted mechanism. However, logging introduces write-twice overhead. This paper introduces an optimized write-combined logging to reduce the writes to NVRAM log. By leveraging the fastread and byte-addressable features of NVRAM, we can perform a read-and-compare operation before writes and thus issue writes in a finer-grained way. We tested our system on the benchmark suit STAMP which contains real-world applications. Experiment results show that our system can reduce the writes to NVRAM by 33%-34%, which can help extend the lifetime of NVRAM and improve performance. Averagely our system can improve performance by 7%-11%

    Write-Combined Logging: An Optimized Logging for Consistency in NVRAM

    Get PDF

    State-preserving container orchestration in failover scenarios

    Get PDF
    Containers have been widely adopted for deployment of high availability applications and services. This adoption is in part due to the native support of fault tolerance mechanisms in container orchestration frameworks such as Kubernetes. While Kubernetes provides service replication as a fault tolerance mechanism for stateless applications, service replication does not satisfy requirements for stateful applications. Currently this shortcoming is addressed by data replication in databases. This requires a tight coupling and modification of the stateful application to support high availability. Thus, this thesis proposes a new Checkpoint/Restore (C/R) Kubernetes operator to achieve fault tolerance for stateful applications without any modification of the application. The operator takes a checkpoint in a configurable interval. In case of a fault a new application container is created automatically from the most recent checkpoint. We compare the proposed approach with a more conventional approach in which we pull and restore the application state from the application through an API. We measure the overhead of both methods, the service interruption and the recovery time in case of faults. We find the C/R Operator has similar performance in recovery time as the traditional approach, but does not need any application modification. The results signify C/R as a promising technology for a fault tolerance mechanism for stateful applications

    Facilitating Emerging Non-volatile Memories in Next-Generation Memory System Design: Architecture-Level and Application-Level Perspectives

    Get PDF
    This dissertation focuses on three types of emerging NVMs, spin-transfer torque RAM (STT-RAM), phase change memory (PCM), and metal-oxide resistive RAM (ReRAM). STT-RAM has been identified as the best replacement of SRAM to build large-scale and low-power on-chip caches and also an energy-efficient alternative to DRAM as main memory. PCM and ReRAM have been considered to be promising technologies for building future large-scale and low-power main memory systems. This dissertation investigates two aspects to facilitate them in next-generation memory system design, architecture-level and application-level perspectives. First, multi-level cell (MLC) STT-RAM based cache design is optimized by using data encoding and data compression. Second, MLC STT-RAM is utilized as persistent main memory for fast and energy-efficient local checkpointing. Third, the commonly used database indexing algorithm, B+tree, is redesigned to be NVM-friendly. Forth, a novel processing-in-memory architecture built on ReRAM based main memory is proposed to accelerate neural network applications

    Studies in Exascale Computer Architecture: Interconnect, Resiliency, and Checkpointing

    Full text link
    Today’s supercomputers are built from the state-of-the-art components to extract as much performance as possible to solve the most computationally intensive problems in the world. Building the next generation of exascale supercomputers, however, would require re-architecting many of these components to extract over 50x more performance than the current fastest supercomputer in the United States. To contribute towards this goal, two aspects of the compute node architecture were examined in this thesis: the on-chip interconnect topology and the memory and storage checkpointing platforms. As a first step, a skeleton exascale system was modeled to meet 1 exaflop of performance along with 100 petabytes of main memory. The model revealed that large kilo-core processors would be necessary to meet the exaflop performance goal; existing topologies, however, would not scale to those levels. To address this new challenge, we investigated and proposed asymmetric high-radix topologies that decoupled local and global communications and used different radix routers for switching network traffic at each level. The proposed topologies scaled more readily to higher numbers of cores with better latency and energy consumption than before. The vast number of components that the model revealed would be needed in these exascale systems cautioned towards better fault tolerance mechanisms. To address this challenge, we showed that local checkpoints within the compute node can be saved to a hybrid DRAM and SSD platform in order to write them faster without wearing out the SSD or consuming a lot of energy. A hybrid checkpointing platform allowed more frequent checkpoints to be made without sacrificing performance. Subsequently, we proposed switching to a DIMM-based SSD in order to perform fine-grained I/O operations that would be integral in interleaving checkpointing and computation while still providing persistence guarantees. Two more techniques that consolidate and overlap checkpointing were designed to better hide the checkpointing latency to the SSD.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/137096/1/sabeyrat_1.pd
    corecore