48 research outputs found
QUASII: QUery-Aware Spatial Incremental Index.
With large-scale simulations of increasingly detailed models and improvements in data acquisition technologies, massive amounts of data are easily and quickly created and collected. Traditional systems require indexes to be built before analytic queries can be executed efficiently. Such an indexing step requires substantial computing resources and introduces a considerable and growing data-to-insight gap where scientists need to wait before they can perform any analysis. Moreover, scientists often use only a small fraction of the data - the parts containing interesting phenomena - and indexing it fully does not always pay off. In this paper, we develop a novel incremental index for the exploration of spatial data. Our approach, QUASII, builds a data-oriented index as a side-effect of query execution. QUASII distributes the cost of indexing across all queries, while building the index structure only for the subset of data queried. It reduces data-to-insight time and curbs the cost of incremental indexing by gradually and partially sorting the data, while producing a data-oriented hierarchical structure at the same time. As our experiments show, QUASII reduces the data-to-insight time by up to 11.4x, while its performance converges to that of state-of-the-art static indexes.
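The abstract does not give QUASII's algorithm, but the core idea of building an index as a side-effect of query execution can be illustrated with a minimal one-dimensional "cracking"-style sketch (QUASII itself is spatial and hierarchical; everything below is a simplification):

```python
import bisect

class IncrementalIndex:
    """Toy 1-D cracking index: the data is partially sorted as a
    side-effect of range queries, so indexing cost is spread across
    the workload and only queried regions get organized."""

    def __init__(self, data):
        self.data = list(data)          # payload, initially unsorted
        self.cracks = [0, len(data)]    # partition boundaries (positions)
        self.pivots = []                # sorted pivot values, one per inner crack

    def _crack(self, value):
        """Partition the piece containing `value`; later queries touch
        ever smaller pieces, so the per-query cost shrinks."""
        i = bisect.bisect_left(self.pivots, value)
        if i < len(self.pivots) and self.pivots[i] == value:
            return                      # already cracked on this value
        lo, hi = self.cracks[i], self.cracks[i + 1]
        piece = self.data[lo:hi]
        left = [x for x in piece if x < value]
        right = [x for x in piece if x >= value]
        self.data[lo:hi] = left + right
        self.pivots.insert(i, value)
        self.cracks.insert(i + 1, lo + len(left))

    def range_query(self, low, high):
        """Return sorted values in [low, high); cracks as a side-effect."""
        self._crack(low)
        self._crack(high)
        i_lo = self.pivots.index(low)
        i_hi = self.pivots.index(high)
        return sorted(self.data[self.cracks[i_lo + 1]:self.cracks[i_hi + 1]])
```

Each query pays only for partitioning the pieces it touches, so frequently queried regions converge toward fully sorted order while untouched data stays unindexed.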
Speedy Transactions in Multicore In-Memory Databases
Silo is a new in-memory database that achieves excellent performance and scalability on modern multicore machines. Silo was designed from the ground up to use system memory and caches efficiently. For instance, it avoids all centralized contention points, including that of centralized transaction ID assignment. Silo's key contribution is a commit protocol based on optimistic concurrency control that provides serializability while avoiding all shared-memory writes for records that were only read. Though this might seem to complicate the enforcement of a serial order, correct logging and recovery is provided by linking periodically-updated epochs with the commit protocol. Silo provides the same guarantees as any serializable database without unnecessary scalability bottlenecks or much additional latency. Silo achieves almost 700,000 transactions per second on a standard TPC-C workload mix on a 32-core machine, as well as near-linear scalability. Considered per core, this is several times higher than previously reported results.
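The essence of the commit protocol described above, tracking reads locally and validating them at commit so that read-only records are never written to, can be sketched as follows (a heavily simplified, single-machine illustration; Silo's real protocol additionally uses per-record locks, epoch-based TID assignment, and careful memory ordering):

```python
class Record:
    def __init__(self, value):
        self.value = value
        self.tid = 0              # version / transaction id, bumped on write

class Transaction:
    """Minimal optimistic-concurrency-control commit in the spirit of
    Silo: reads are tracked in a private read set, validated at commit,
    and records that were only read receive no shared-memory writes."""

    def __init__(self, store):
        self.store = store
        self.reads = {}           # key -> tid observed at read time
        self.writes = {}          # key -> new value (private until commit)

    def read(self, key):
        if key in self.writes:    # read-your-own-writes
            return self.writes[key]
        rec = self.store[key]
        self.reads[key] = rec.tid
        return rec.value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self, new_tid):
        # validation: abort if any record we read has changed since
        for key, tid in self.reads.items():
            if self.store[key].tid != tid:
                return False
        # install writes; read-only records are left untouched
        for key, value in self.writes.items():
            rec = self.store[key]
            rec.value, rec.tid = value, new_tid
        return True
```

A transaction that read a record which another transaction subsequently modified fails validation and must retry, which is how serializability is preserved without write-locking reads.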
An Update-intensive LSM-based R-tree Index
Many applications require update-intensive workloads on spatial objects,
e.g., social-network services and shared-riding services that track moving
objects. By buffering insert and delete operations in memory, the Log
Structured Merge Tree (LSM) has been used widely in various systems because of
its ability to handle write-heavy workloads. While the focus on LSM has been on
key-value stores and their optimizations, there is a need to study how to
efficiently support LSM-based secondary indexes (e.g., location-based
indexes) as modern, heterogeneous data necessitates the use of secondary
indexes. In this paper, we investigate the augmentation of a main-memory-based
memo structure into an LSM secondary index structure to handle update-intensive
workloads efficiently. We conduct this study in the context of an R-tree-based
secondary index. In particular, we introduce the LSM RUM-tree that demonstrates
the use of an Update Memo in an LSM-based R-tree to enhance the performance of
the R-tree's insert, delete, update, and search operations. The LSM RUM-tree
introduces new strategies to control the size of the Update Memo to make sure
it always fits in memory for high performance. The Update Memo is a
lightweight in-memory structure that is suitable for handling update-intensive
workloads without introducing significant overhead. Experimental results using
real spatial data demonstrate that the LSM RUM-tree achieves up to 9.6x speedup
on update operations and up to 2400x speedup on query processing over existing
LSM R-tree implementations.
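The abstract describes the Update Memo only at a high level; its role, letting searches over LSM components skip obsolete entries without eagerly deleting them, can be sketched with a toy version (the real structure and its size-control strategies are more involved, and the method names below are our own):

```python
class UpdateMemo:
    """Toy Update Memo: maps object id -> latest sequence number, so a
    search over LSM components can recognize and skip stale versions
    of an object instead of consulting delete tombstones."""

    def __init__(self):
        self.latest = {}

    def record_update(self, oid, seq):
        """Called when an object is (re)inserted with a new sequence number."""
        self.latest[oid] = max(seq, self.latest.get(oid, -1))

    def is_current(self, oid, seq):
        """During search: keep an entry only if it is the newest version.
        Unknown ids (never updated) are treated as current."""
        return self.latest.get(oid, seq) == seq

    def garbage_collect(self, compacted_oids):
        """Shrink the memo once compaction has physically removed all
        stale versions of an object - this is what keeps it in memory."""
        for oid in compacted_oids:
            self.latest.pop(oid, None)
```

An R-tree search would then filter its candidate set through `is_current`, so updates become cheap appends rather than in-place deletions.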
F2: Designing a Key-Value Store for Large Skewed Workloads
Today's key-value stores are either disk-optimized, focusing on large data
and saturating device IOPS, or memory-optimized, focusing on high throughput
with linear thread scaling assuming plenty of main memory. However, many
practical workloads demand high performance for read and write working sets
that are much larger than main memory, over a total data size that is even
larger. They require judicious use of memory and disk, and today's systems do
not handle such workloads well. We present F2, a new key-value store design
based on compartmentalization -- it consists of five key components that work
together in well-defined ways to achieve high throughput -- saturating disk and
memory bandwidths -- while incurring low disk read and write amplification. A
key design characteristic of F2 is that it separates the management of hot and
cold data, across the read and write domains, and adapts the use of memory to
optimize each case. Through a sequence of new latch-free system constructs, F2
solves the key challenge of maintaining high throughput with linear thread
scalability in such a compartmentalized design. Detailed experiments on
benchmark data validate our design's superiority, in terms of throughput, over
state-of-the-art key-value stores, when the available memory resources are
scarce.
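F2's design is far richer than this, but the basic idea of separating hot and cold data and adapting memory use to each can be illustrated with a hypothetical two-tier store sketch (the class, its capacity policy, and LRU demotion are our illustrative choices, not F2's actual components):

```python
from collections import OrderedDict

class TieredStore:
    """Hypothetical hot/cold separation sketch: recently accessed keys
    live in a bounded in-memory tier; everything else sits on a slower
    tier (a stand-in for disk). Accessing a cold key promotes it and
    may demote the least-recently-used hot key."""

    def __init__(self, hot_capacity):
        self.hot = OrderedDict()   # in-memory tier, kept in LRU order
        self.cold = {}             # stand-in for the disk tier
        self.cap = hot_capacity

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.cap:
            old_key, old_val = self.hot.popitem(last=False)  # demote LRU
            self.cold[old_key] = old_val

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)       # refresh recency
            return self.hot[key]
        value = self.cold.pop(key)          # cold hit: promote on access
        self.put(key, value)
        return value
```

Under a skewed workload, the small hot tier absorbs most traffic while the cold tier holds the bulk of the data, which is the memory/disk division of labor the abstract argues for.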
SALI: A Scalable Adaptive Learned Index Framework based on Probability Models
The growth in data storage capacity and the increasing demands for high
performance have created several challenges for concurrent indexing structures.
One promising solution is learned indexes, which use a learning-based approach
to fit the distribution of stored data and predictively locate target keys,
significantly improving lookup performance. Despite their advantages,
prevailing learned indexes face scalability constraints on multi-core data
storage systems.
This paper introduces SALI, the Scalable Adaptive Learned Index framework,
which incorporates two strategies aimed at achieving high scalability,
improving efficiency, and enhancing the robustness of the learned index.
Firstly, a set of node-evolving strategies is defined to enable the learned
index to adapt to various workload skews and enhance its concurrency
performance in such scenarios. Secondly, a lightweight strategy is proposed to
maintain statistical information within the learned index, with the goal of
further improving the scalability of the index. To validate their
effectiveness, we applied the two strategies to LIPP, a learned index
structure that uses fine-grained write locks. The experimental results
demonstrate that SALI improves insertion throughput with 64 threads by an
average of 2.04x compared to the second-best learned index, while achieving
lookup throughput similar to that of LIPP+.
Comment: Accepted at SIGMOD 2024, June 09-15, 2024, Santiago, Chile.
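SALI's node-evolving and statistics strategies are not detailed in the abstract, but the basic learned-index mechanism they build on, a model predicting a key's position followed by a small bounded correction search, can be sketched generically (this is not SALI's structure; the linear model and error bound below are the standard textbook formulation):

```python
import bisect

class LinearLearnedIndex:
    """Generic learned-index lookup: a linear model fit over the sorted
    keys predicts a position, and a search bounded by the model's
    maximum training error finds the exact slot."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        self.slope = (n - 1) / (hi - lo) if hi > lo else 0.0
        self.intercept = -self.slope * lo
        # worst-case prediction error over the stored keys
        self.err = max(abs(self._predict(k) - i)
                       for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(self.slope * key + self.intercept)

    def lookup(self, key):
        p = self._predict(key)
        lo = max(0, p - self.err)
        hi = min(len(self.keys), p + self.err + 2)
        i = lo + bisect.bisect_left(self.keys[lo:hi], key)
        return i if i < len(self.keys) and self.keys[i] == key else None
```

Because the correction search is bounded by the recorded error, lookups avoid a full binary search; concurrent variants like SALI must additionally keep such models consistent under writes, which is where its node-evolving strategies come in.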
Spatial Join with R-Tree on Graphics Processing Units
Spatial operations such as spatial join combine two sets of objects on spatial predicates. Spatial join differs from relational join because the objects are multidimensional, and it incurs long execution times. Much recent research has therefore sought to reduce this cost. Parallelizing the spatial join is one such method, since comparisons between objects can be performed independently. Because spatial datasets are large, the R-Tree data structure is commonly used to improve spatial join performance. In this paper, a parallel spatial join on the graphics processing unit (GPU) is introduced, exploiting the GPU's many processors to accelerate the computation. An experiment compares a sequential implementation of the spatial join in C on the CPU against a parallel implementation in CUDA C on the GPU. The results show that the spatial join on the GPU is faster than on a conventional processor.
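The reason the join parallelizes so well is that each pair comparison is an independent bounding-box test. A minimal sketch of that filter step is below, written sequentially in Python for clarity; on the GPU each pair test would map to its own thread (the paper's actual kernel is in CUDA C and uses an R-Tree to prune pairs, which this sketch omits):

```python
def mbr_intersects(a, b):
    """Axis-aligned MBR overlap test; boxes are (xmin, ymin, xmax, ymax)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def spatial_join(r, s):
    """Filter step of a spatial join over two MBR lists. Every pair
    test is independent of every other, which is what makes the
    workload embarrassingly parallel and GPU-friendly."""
    return [(i, j)
            for i, a in enumerate(r)
            for j, b in enumerate(s)
            if mbr_intersects(a, b)]
```

The candidate pairs produced by this filter step would then go through an exact-geometry refinement step in a full system.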
One stone, two birds: A lightweight multidimensional learned index with cardinality support
Innovative learning-based structures have recently been proposed to tackle
indexing and cardinality estimation (CE) tasks, namely learned indexes and
data-driven cardinality estimators. These structures capture data
distributions well, making them promising for integration into AI-driven
database kernels. However, accurate estimation for corner-case queries
requires a large number of network parameters, resulting in higher computing
costs on expensive GPUs and more storage overhead. Additionally, implementing
CE and the learned index separately is wasteful, since the distribution of a
single table is stored twice. These issues complicate the design of AI-driven
database kernels: in real database scenarios, a compact kernel is necessary
to process queries within limited storage and time budgets, and directly
integrating the two AI approaches would yield a heavy, complex kernel with
many network parameters and redundant copies of the distribution parameters.
Our proposed CardIndex structure kills these two birds with one stone. It is
a fast multidimensional learned index that also serves as a lightweight
cardinality estimator, with parameters scaled at the KB level. Thanks to its
special structure and small parameter size, it obtains both CDF and PDF
information for tuples with a latency as low as 1 to 10 microseconds. For
low-selectivity estimation tasks, instead of enlarging the model to capture
fine-grained point densities, we fully exploit the structure's
characteristics and propose a hybrid estimation algorithm that provides fast
and exact results.
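The "one structure, two tasks" idea rests on the observation that a CDF over the key space answers both questions: a position estimate for lookups (roughly n * F(k)) and a range-cardinality estimate (n * (F(hi) - F(lo))). A minimal sketch using the exact empirical CDF makes this concrete (CardIndex itself uses compact learned models over multidimensional data; the class below is our illustrative stand-in):

```python
import bisect

class CdfIndex:
    """One structure serving two tasks: the empirical CDF over sorted
    keys drives both point lookups and range-cardinality estimates."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.n = len(self.keys)

    def cdf(self, k):
        """Fraction of keys <= k."""
        return bisect.bisect_right(self.keys, k) / self.n

    def lookup(self, k):
        """Index task: exact position of k, or None if absent."""
        i = bisect.bisect_left(self.keys, k)
        return i if i < self.n and self.keys[i] == k else None

    def estimate_range(self, lo, hi):
        """CE task: estimated number of keys in (lo, hi]."""
        return round(self.n * (self.cdf(hi) - self.cdf(lo)))
```

Replacing the exact CDF with a small learned approximation is what lets a structure like CardIndex keep its parameters at the KB level while still serving both roles.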
FPTree: A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory
The advent of Storage Class Memory (SCM) is driving a rethink of storage systems towards a single-level architecture where memory and storage are merged. In this context, several works have investigated how to design persistent trees in SCM as a fundamental building block for these novel systems. However, these trees are significantly slower than DRAM-based counterparts since trees are latency-sensitive and SCM exhibits higher latencies than DRAM. In this paper, we propose a novel hybrid SCM-DRAM persistent and concurrent B-Tree, named the Fingerprinting Persistent Tree (FPTree), that achieves similar performance to DRAM-based counterparts. In this novel design, leaf nodes are persisted in SCM while inner nodes are placed in DRAM and rebuilt upon recovery. The FPTree uses Fingerprinting, a technique that limits the expected number of in-leaf probed keys to one. In addition, we propose a hybrid concurrency scheme for the FPTree that is partially based on Hardware Transactional Memory. We conduct a thorough performance evaluation and show that the FPTree outperforms state-of-the-art persistent trees with different SCM latencies by up to a factor of 8.2. Moreover, we show that the FPTree scales very well on a machine with 88 logical cores. Finally, we integrate the evaluated trees in memcached and a prototype database. We show that the FPTree incurs an almost negligible performance overhead over using fully transient data structures, while significantly outperforming other persistent trees.
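The Fingerprinting technique mentioned above can be sketched simply: each leaf stores a one-byte hash per key, and a search does the expensive full-key comparison only where the fingerprint matches. The sketch below illustrates the probing logic only (the hash choice is ours; FPTree's leaves are unsorted, persistent SCM structures with bitmaps and careful flush ordering, none of which is modeled here):

```python
def fingerprint(key):
    """One-byte fingerprint of a key (illustrative hash choice)."""
    return sum(str(key).encode()) & 0xFF

class Leaf:
    """Leaf search with fingerprints: the cheap one-byte filter is
    scanned first, so the expected number of full key comparisons
    per lookup stays close to one."""

    def __init__(self):
        self.fps, self.keys, self.vals = [], [], []

    def insert(self, key, val):
        self.fps.append(fingerprint(key))
        self.keys.append(key)
        self.vals.append(val)

    def search(self, key):
        fp = fingerprint(key)
        for i, f in enumerate(self.fps):
            if f == fp and self.keys[i] == key:  # filter, then full compare
                return self.vals[i]
        return None
```

On SCM this matters because each full key comparison may be a high-latency read, whereas the fingerprint array is tiny and scans fast.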
Enhance Connectivity of Promising Regions for Sampling-based Path Planning
Sampling-based path planning algorithms usually implement uniform sampling
methods to search the state space. However, uniform sampling may lead to
unnecessary exploration in many scenarios, such as the environment with a few
dead ends. Our previous work proposed guiding the sampling process with
predicted promising regions to address this issue. However, the predicted promising regions
are often disconnected, which means they cannot connect the start and goal
state, resulting in a lack of probabilistic completeness. This work focuses on
enhancing the connectivity of predicted promising regions. Our proposed method
regresses the connectivity probability of the edges in the x and y directions.
In addition, it calculates the weight of the promising edges in loss to guide
the neural network to pay more attention to the connectivity of the promising
regions. We conduct a series of simulation experiments, and the results show
that the connectivity of promising regions improves significantly. Furthermore,
we analyze the effect of connectivity on sampling-based path planning
algorithms and conclude that connectivity plays an essential role in
maintaining algorithm performance.
Comment: Accepted in Transactions on Automation Science and Engineering, 202
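The abstract's point about connectivity and probabilistic completeness rests on a standard pattern in region-guided planners: bias samples toward the predicted promising regions, but retain a uniform fallback so the whole space is still covered even when the regions are disconnected. A hypothetical sketch of such a sampler (the function, its parameters, and the cell representation are our illustrative choices, not the paper's method):

```python
import random

def guided_sample(bounds, promising_cells, bias=0.8, rng=random):
    """With probability `bias`, sample inside a predicted promising
    cell; otherwise sample uniformly over the whole space. The uniform
    fallback preserves probabilistic completeness even when the
    promising cells fail to connect start and goal."""
    (xmin, xmax), (ymin, ymax) = bounds
    if promising_cells and rng.random() < bias:
        (cx0, cx1), (cy0, cy1) = rng.choice(promising_cells)
        return (rng.uniform(cx0, cx1), rng.uniform(cy0, cy1))
    return (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
```

Improving the connectivity of the predicted regions, as this work does, raises the fraction of biased samples that actually advance the tree toward the goal, which is why it helps sampling-based planners.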