15 research outputs found

    Improving Index Performance through Prefetching

    Full text link

    Cache craftiness for fast multicore key-value storage

    Get PDF
    We present Masstree, a fast key-value database designed for SMP machines. Masstree keeps all data in memory. Its main data structure is a trie-like concatenation of B+-trees, each of which handles a fixed-length slice of a variable-length key. This structure effectively handles arbitrary-length possiblybinary keys, including keys with long shared prefixes. [superscript +]-tree fanout was chosen to minimize total DRAM delay when descending the tree and prefetching each tree node. Lookups use optimistic concurrency control, a read-copy-update-like technique, and do not write shared data structures; updates lock only affected nodes. Logging and checkpointing provide consistency and durability. Though some of these ideas appear elsewhere, Masstree is the first to combine them. We discuss design variants and their consequences. On a 16-core machine, with logging enabled and queries arriving over a network, Masstree executes more than six million simple queries per second. This performance is comparable to that of memcached, a non-persistent hash table server, and higher (often much higher) than that of VoltDB, MongoDB, and Redis.National Science Foundation (U.S.). (Award 0834415)National Science Foundation (U.S.). (Award 0915164)Quanta Computer (Firm

    STEPS Towards Cache-Resident Transaction Processing

    Get PDF

    Partial Replica Location And Selection For Spatial Datasets

    Get PDF
    As the size of scientific datasets continues to grow, we will not be able to store enormous datasets on a single grid node, but must distribute them across many grid nodes. The implementation of partial or incomplete replicas, which represent only a subset of a larger dataset, has been an active topic of research. Partial Spatial Replicas extend this functionality to spatial data, allowing us to distribute a spatial dataset in pieces over several locations. We investigate solutions to the partial spatial replica selection problems. First, we describe and develop two designs for an Spatial Replica Location Service (SRLS), which must return the set of replicas that intersect with a query region. Integrating a relational database, a spatial data structure and grid computing software, we build a scalable solution that works well even for several million replicas. In our SRLS, we have improved performance by designing a R-tree structure in the backend database, and by aggregating several queries into one larger query, which reduces overhead. We also use the Morton Space-filling Curve during R-tree construction, which improves spatial locality. In addition, we describe R-tree Prefetching(RTP), which effectively utilizes the modern multi-processor architecture. Second, we present and implement a fast replica selection algorithm in which a set of partial replicas is chosen from a set of candidates so that retrieval performance is maximized. Using an R-tree based heuristic algorithm, we achieve O(n log n) complexity for this NP-complete problem. We describe a model for disk access performance that takes filesystem prefetching into account and is sufficiently accurate for spatial replica selection. Making a few simplifying assumptions, we present a fast replica selection algorithm for partial spatial replicas. The algorithm uses a greedy approach that attempts to maximize performance by choosing a collection of replica subsets that allow fast data retrieval by a client machine. Experiments show that the performance of the solution found by our algorithm is on average always at least 91% and 93.4% of the performance of the optimal solution in 4-node and 8-node tests respectively

    Engineering Parallel Bi-Criteria Shortest Path Search

    Get PDF

    Creating a Concurrent In-Memory B-Tree Optimized for NUMA Systems

    Get PDF
    The size of main memory is becoming larger. With the number of Central Processing Unit (CPU) cores ever increasing in modern systems, with each of them being able to access memory, the organization of memory becomes more important. In multicore systems, there are two main architectures for memory organization with respect to the cores - Symmetric Multi-Processor (SMP) and Non-Uniform Memory Architecture (NUMA). Prior work has focused on the improvement of the performance of B-Trees in highly concurrent and distributed environments, as well as in memory, for shared-memory mul- tiprocessors. However, little focus has been given to the performance of main memory B-Trees for NUMA systems. This work focuses on improving the performance of B-Trees contained in main memory of NUMA systems by introducing modifications that consider its storage in the physically distributed main memory of the NUMA system. The work in this thesis makes the following contributions to the development of a distributed B-Tree, specifically in a NUMA environment, modified from a B-Tree originally designed for high concurrency: • It introduces replication of internal nodes of the tree and shows how this can improve its overall performance in a NUMA environment. • It introduces NUMA-aware locking procedures with the aim of managing contention and exploiting locality of lock requests with reference to previous client operation request locations. • It introduces changes in the granularity of locking, starting from the original locking of every node to the locking of certain levels of nodes, showing the tradeoff between the granularity of locking and the performance of the tree based on the workload. • It considers the combination of the different techniques, with the aim of finding the combination which performs well overall for varying read-heavy workloads and number of client threads
    corecore