6,064 research outputs found
Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems
Two emerging hardware trends will dominate the database system technology in
the near future: increasing main memory capacities of several TB per server and
massively parallel multi-core processing. Many algorithmic and control
techniques in current database technology were devised for disk-based systems
where I/O dominated the performance. In this work we take a new look at the
well-known sort-merge join which, so far, has not been in the focus of research
in scalable massively parallel multi-core data processing as it was deemed
inferior to hash joins. We devise a suite of new massively parallel sort-merge
(MPSM) join algorithms that are based on partial partition-based sorting.
Contrary to classical sort-merge joins, our MPSM algorithms do not rely on a
hard to parallelize final merge step to create one complete sort order. Rather
they work on the independently created runs in parallel. This way our MPSM
algorithms are NUMA-affine as all the sorting is carried out on local memory
partitions. An extensive experimental evaluation on a modern 32-core machine
with one TB of main memory proves the competitive performance of MPSM on large
main memory databases with billions of objects. It scales (almost) linearly in
the number of employed cores and clearly outperforms competing hash join
proposals - in particular it outperforms the "cutting-edge" Vectorwise parallel
query engine by a factor of four.Comment: VLDB201
Aggregating over Dominated Points by Sorting, Scanning, Zip and Flat Maps
Prefix aggregation operation (also called scan), and its particular case, prefix summation, is an important parallel primitive and enjoys a lot of attention in the research literature. It is also used in many algorithms as one of the steps.
Aggregation over dominated points in ?^m is a multidimensional generalisation of prefix aggregation. It is also intensively researched, both as a parallel primitive and as a practical problem, encountered in computational geometry, spatial databases and data warehouses.
In this paper we show that, for a constant dimension m, aggregation over dominated points in ?^m can be computed by O(1) basic operations that include sorting the whole dataset, zipping sorted lists of elements, computing prefix aggregations of lists of elements and flat maps, which expand the data size from initial n to n log^{m-1}n.
Thereby we establish that prefix aggregation suffices to express aggregation over dominated points in more dimensions, even though the latter is a far-reaching generalisation of the former. Many problems known to be expressible by aggregation over dominated points become expressible by prefix aggregation, too.
We rely on a small set of primitive operations which guarantee an easy transfer to various distributed architectures and some desired properties of the implementation
GraphSE: An Encrypted Graph Database for Privacy-Preserving Social Search
In this paper, we propose GraphSE, an encrypted graph database for online
social network services to address massive data breaches. GraphSE preserves
the functionality of social search, a key enabler for quality social network
services, where social search queries are conducted on a large-scale social
graph and meanwhile perform set and computational operations on user-generated
contents. To enable efficient privacy-preserving social search, GraphSE
provides an encrypted structural data model to facilitate parallel and
encrypted graph data access. It is also designed to decompose complex social
search queries into atomic operations and realise them via interchangeable
protocols in a fast and scalable manner. We build GraphSE with various
queries supported in the Facebook graph search engine and implement a
full-fledged prototype. Extensive evaluations on Azure Cloud demonstrate that
GraphSE is practical for querying a social graph with a million of users.Comment: This is the full version of our AsiaCCS paper "GraphSE: An
Encrypted Graph Database for Privacy-Preserving Social Search". It includes
the security proof of the proposed scheme. If you want to cite our work,
please cite the conference version of i
- …