The Impact of Global Clustering on Spatial Database Systems
Global clustering has rarely been investigated in
the area of spatial database systems although dramatic
performance improvements can be
achieved by using suitable techniques. In this paper,
we propose a simple approach to global clustering
called cluster organization. We will demonstrate
that this cluster organization leads to considerable
performance improvements without any
algorithmic overhead. Based on real geographic
data, we perform a detailed empirical performance
evaluation and compare the cluster organization
to other organization models not using global
clustering. We will show that global clustering
speeds up the processing of window queries as
well as spatial joins without decreasing the performance
of the insertion of new objects and of selective
queries such as point queries. The spatial
join is sped up by a factor of about 4, whereas
non-selective window queries are accelerated by
even higher speed-up factors.
Grid File Approach to Large Multidimensional Dynamic Data Structures
Computing and Information Science
Multidimensional Range Queries on Modern Hardware
Range queries over multidimensional data are an important part of database
workloads in many applications. Their execution may be accelerated by using
multidimensional index structures (MDIS), such as kd-trees or R-trees. As for
most index structures, the usefulness of this approach depends on the
selectivity of the queries, and conventional wisdom held that a simple scan beats
MDIS for queries accessing more than 15%-20% of a dataset. However, this wisdom
is largely based on evaluations that are almost two decades old, performed on
data being held on disks, applying IO-optimized data structures, and using
single-core systems. The question is whether this rule of thumb still holds
when multidimensional range queries (MDRQ) are performed on modern
architectures with large main memories holding all data, multi-core CPUs and
data-parallel instruction sets. In this paper, we study the question whether
and how much modern hardware influences the performance ratio between index
structures and scans for MDRQ. To this end, we conservatively adapted three
popular MDIS, namely the R*-tree, the kd-tree, and the VA-file, to exploit
features of modern servers and compared their performance to different flavors
of parallel scans using multiple (synthetic and real-world) analytical
workloads over multiple (synthetic and real-world) datasets of varying size,
dimensionality, and skew. We find that all approaches benefit considerably from
using main memory and parallelization, yet to varying degrees. Our evaluation
indicates that, on current machines, scanning should be favored over parallel
versions of classical MDIS even for very selective queries.
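As a rough illustration of the scan baseline discussed in this abstract, the following minimal sketch answers a multidimensional range query by testing every point against the query box. It is not the paper's implementation (which uses parallel, data-parallel scans); the names `range_scan` and `selectivity` and the sample data are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def range_scan(points: Sequence[Point], low: Point, high: Point) -> List[Point]:
    """Return every point p with low[d] <= p[d] <= high[d] in all dimensions d.

    Hypothetical helper: a plain sequential scan, with no index and no parallelism.
    """
    return [
        p for p in points
        if all(lo <= x <= hi for x, lo, hi in zip(p, low, high))
    ]


def selectivity(points: Sequence[Point], low: Point, high: Point) -> float:
    """Fraction of the dataset that the query box returns."""
    return len(range_scan(points, low, high)) / len(points)


# Illustrative data only (not taken from the paper).
points = [(1.0, 2.0), (3.0, 4.0), (5.0, 1.0), (2.5, 2.5)]
hits = range_scan(points, low=(2.0, 2.0), high=(4.0, 5.0))
frac = selectivity(points, low=(2.0, 2.0), high=(4.0, 5.0))
```

The scan touches every point regardless of selectivity, which is why its cost relative to an index depends on what fraction of the data a query returns.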
New generic indexing technology
There has been no fundamental change in the dynamic indexing methods supporting database systems since the invention of the B-tree twenty-five years ago. And yet the whole classical approach to dynamic database indexing has long since become inappropriate and increasingly inadequate. We are moving rapidly from the conventional one-dimensional world of fixed-structure text and numbers to a multi-dimensional world of variable structures, objects and images, in space and time. But, even before leaving the confines of conventional database indexing, the situation is highly unsatisfactory. In fact, our research has led us to question the basic assumptions of conventional database indexing. We have spent the past ten years studying the properties of multi-dimensional indexing methods, and in this paper we draw the strands of a number of developments together, some quite old and some very new, to show how we now have the basis for a new generic indexing technology for the next generation of database systems.
Setup-Free Secure Search on Encrypted Data: Faster and Post-Processing Free
We present a novel protocol on data and queries encrypted with Fully Homomorphic Encryption (FHE).
Our protocol enables organizations (client) to (1) securely upload an unsorted data array to an untrusted honest-but-curious server, where data may be uploaded over time and from multiple data sources; and (2) securely issue repeated search queries for retrieving the first element satisfying an agreed matching criterion, as well as fetching the next matching elements with further interaction.
For security, the client encrypts the data and queries with FHE prior to uploading, and the server processes the ciphertexts to produce the result ciphertext for the client to decrypt.
Our secure search protocol improves over the prior state-of-the-art for secure search on FHE encrypted data (Akavia, Feldman, Shaul (AFS), CCS'2018) in achieving:
(1) A post-processing free protocol where the server produces a ciphertext for the correct search outcome with overwhelming success probability. This is in contrast to returning a list of candidates for the client to post-process, or suffering from a noticeable error probability, in AFS. Our post-processing freeness enables the server to use secure search as a sub-component in a larger computation without interaction with the client.
(2) (a) Client time and communication bandwidth are improved by a factor. (b) Server evaluates a polynomial of degree linear in (compared to cubic in AFS), and overall number of multiplications improved by up to factor. (c) Employing only computations (compared to for in AFS) to gain both further speedup and compatibility to all current FHE candidates.
(3) we executed on identical hardware for implementations of ours versus AFS's protocols.
Additionally, like other FHE-based solutions, our solution is setup-free: to outsource elements from the client to the server, no additional actions are performed on except for encrypting it element by element (each element bit by bit) and uploading the resulting ciphertexts to the server.
Stateful Multi-Client Verifiable Computation
This paper develops a cryptographic protocol for outsourcing arbitrary stateful computation among multiple clients to an untrusted server, while guaranteeing integrity of the data. The clients communicate only with the server and store only a short authenticator to ensure that the server does not cheat.
Our contribution is two-fold. First, we extend the recent hash&prove scheme of Fiore et al. (CCS 2016) to stateful computations that support arbitrary updates by the untrusted server, in a way that can be verified by the clients. We use this scheme to generically instantiate authenticated data types. Second, we describe a protocol for multi-client verifiable computation based on an authenticated data type, and prove that it achieves a computational version of fork linearizability. This is the strongest guarantee that can be achieved in the setting where clients do not communicate directly; it ensures correctness and consistency of outputs seen by the clients individually.
Scalability analysis of declustering methods for multidimensional range queries
Efficient storage and retrieval of multiattribute data sets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multiattribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files across multiple disks to obtain high performance for disk accesses. Although the scalability of the declustering methods becomes increasingly important for systems equipped with a large number of disks, no analytic studies have been done so far. In this paper, we derive formulas describing the scalability of two popular declustering methods, Disk Modulo and Fieldwise Xor, for range queries, which are the most common type of queries. These formulas disclose the limited scalability of the declustering methods, and this is corroborated by extensive simulation experiments. From the practical point of view, the formulas given in this paper provide a simple measure that can be used to predict the response time of a given range query and to guide the selection of a declustering method under various conditions.
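The two declustering methods named in this abstract can be sketched as follows, assuming their commonly cited definitions: Disk Modulo places a bucket on the disk given by the sum of its coordinates modulo the number of disks, and Fieldwise Xor uses the bitwise XOR of the coordinates instead. The function names, the inclusive range encoding, and the response-time measure (the largest number of qualifying buckets on any one disk) are assumptions made for illustration; the sketch does not reproduce the paper's analytic formulas.

```python
from collections import Counter
from functools import reduce
from itertools import product
from operator import xor
from typing import Callable, Sequence, Tuple

Bucket = Tuple[int, ...]


def disk_modulo(bucket: Bucket, num_disks: int) -> int:
    """Disk Modulo: sum of the bucket coordinates, modulo the number of disks."""
    return sum(bucket) % num_disks


def fieldwise_xor(bucket: Bucket, num_disks: int) -> int:
    """Fieldwise Xor: bitwise XOR of the bucket coordinates, modulo the number of disks."""
    return reduce(xor, bucket) % num_disks


def response_time(
    assign: Callable[[Bucket, int], int],
    ranges: Sequence[Tuple[int, int]],
    num_disks: int,
) -> int:
    """Largest number of qualifying buckets that land on a single disk.

    `ranges` holds one inclusive (low, high) bucket interval per attribute.
    Assumes every bucket costs one unit of disk access time.
    """
    buckets = product(*(range(lo, hi + 1) for lo, hi in ranges))
    load = Counter(assign(b, num_disks) for b in buckets)
    return max(load.values())


# Illustrative 2x2 and 3x3 range queries on a two-attribute file with 4 disks.
small_dm = response_time(disk_modulo, [(0, 1), (0, 1)], 4)
small_fx = response_time(fieldwise_xor, [(0, 1), (0, 1)], 4)
mid_dm = response_time(disk_modulo, [(0, 2), (0, 2)], 4)
mid_fx = response_time(fieldwise_xor, [(0, 2), (0, 2)], 4)
```

Comparing the returned value with the ideal of ceil(buckets / disks) gives a quick feel for how far a method is from perfect declustering on a given query.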