7 research outputs found

    Endurable Transient Inconsistency in Byte-Addressable Persistent B+-Tree

    Get PDF
    Department of Computer Science and EngineeringWith the emergence of byte-addressable persistent memory (PM), a cache line, instead of a page, is expected to be the unit of data transfer between volatile and non-volatile devices, but the failure-atomicity of write operations is guaranteed in the granularity of 8 bytes rather than cache lines. This granularity mismatch problem has generated interest in redesigning block-based data structures such as B+-trees. However, various methods of modifying B+-trees for PM degrade the efficiency of B+-trees, and attempts have been made to use in-memory data structures for PM. In this study, we develop Failure-Atomic ShifT (FAST) and Failure-Atomic In-place Rebalance (FAIR) algorithms to resolve the granularity mismatch problem. Every 8-byte store instruction used in the FAST and FAIR algorithms transforms a B+-tree into another consistent state or a {\it transient inconsistent} state that read operations can tolerate. By making read operations tolerate transient inconsistency, we can avoid expensive copy-on-write, logging, and even the necessity of read latches so that read transactions can be non-blocking. Our experimental results show that legacy B+-trees with FAST and FAIR schemes outperform the state-of-the-art persistent indexing structures by a large margin.clos

    EM-KDE: A locality-aware job scheduling policy with distributed semantic caches

    No full text
    In modern query processing systems, the caching facilities are distributed and scale with the number of servers. To maximize the overall system throughput, the distributed system should balance the query loads among servers and also leverage cached results. In particular, leveraging distributed cached data is becoming more important as many systems are being built by connecting many small heterogeneous machines rather than relying on a few high-performance workstations. Although many query scheduling policies exist such as round-robin and load-monitoring, they are not sophisticated enough to both balance the load and leverage cached results. In this paper, we propose distributed query scheduling policies that take into account the dynamic contents of distributed caching infrastructure and employ statistical prediction methods into query scheduling policy. We employ the kernel density estimation derived from recent queries and the well-known exponential moving average (EMA) in order to predict the query distribution in a multi-dimensional problem space that dynamically changes. Based on the estimated query distribution, the front-end scheduler assigns incoming queries so that query workloads are balanced and cached results are reused. Our experiments show that the proposed query scheduling policy outperforms existing policies in terms of both load balancing and cache hit ratio. (C) 2015 Elsevier Inc. All rights reservedclose0

    EclipseMR: Distributed and Parallel Task Processing with Consistent Hashing

    No full text
    We present EclipseMR, a novel MapReduce framework prototype that efficiently utilizes a large distributed memory in cluster environments. EclipseMR consists of double-layered consistent hash rings - a decentralized DHT-based file system and an in-memory key-value store that employs consistent hashing. The in-memory key-value store in EclipseMR is designed not only to cache local data but also remote data as well so that globally popular data can be distributed across cluster servers and found by consistent hashing. In order to leverage large distributed memories and increase the cache hit ratio, we propose a locality-aware fair (LAF) job scheduler that works as the load balancer for the distributed in-memory caches. Based on hash keys, the LAF job scheduler predicts which servers have reusable data, and assigns tasks to the servers so that they can be reused. The LAF job scheduler makes its best efforts to strike a balance between data locality and load balance, which often conflict with each other. We evaluate EclipseMR by quantifying the performance effect of each component using several representative MapReduce applications and show EclipseMR is faster than Hadoop and Spark by a large margin for various applications

    High-throughput query scheduling with spatial clustering based on distributed exponential moving average

    No full text
    In distributed scientific query processing systems, leveraging distributed cached data is becoming more important. In such systems, a front-end query scheduler distributes queries among many application servers rather than processing queries in a few high-performance workstations. Although many query scheduling policies exist such as round-robin and load-monitoring, they are not sophisticated enough to exploit cached results as well as balance the workload. Efforts were made to improve the query processing performance using statistical methods such as exponential moving average. However, existing methods have limitations for certain query patterns: queries with hotspots, or dynamic query distributions. In this paper, we propose novel query scheduling policies that take into account both the contents of distributed caching infrastructure and the load balance among the servers. Our experiments show that the proposed query scheduling policies outperform existing policies by producing better query plans in terms of load balance and cache-hit ratio.close

    B-3-Tree: Byte-Addressable Binary B-Tree for Persistent Memory

    No full text
    In this work, we propose B-3-tree, a hybrid index for persistent memory that leverages the byte-addressability of the in-memory index and the page locality of B-trees. As in the byte-addressable in-memory index, B-3-tree is updated by 8-byte store instructions. Also, as in disk-based index, B-3-tree is failure-atomic since it makes every 8-byte store instruction transform a consistent index into another consistent index without the help of expensive logging. Since expensive logging becomes unnecessary, the number of cacheline flush instructions required for B-3-tree is significantly reduced. Our performance study shows that B-3-tree outperforms other state-of-the-art persistent indexes in terms of insert and delete performance. While B-3-tree shows slightly worse performance for point query performance, the range query performance of B-3-tree is 2x faster than FAST and FAIR B-tree because the leaf page size of B-3-tree can be set to 8x larger than that of FAST and FAIR B-tree without degrading insertion performance. We also show that read transactions can access B-3- tree without acquiring a shared lock because B-3-tree remains always consistent while a sequence of 8-byte write operations are making changes to it. As a result, B-3-tree provides high concurrency level comparable to FAST and FAIR B-tree
    corecore