1,846 research outputs found
RAM-Efficient External Memory Sorting
In recent years a large number of problems have been considered in external
memory models of computation, where the complexity measure is the number of
blocks of data that are moved between slow external memory and fast internal
memory (also called I/Os). In practice, however, internal memory time often
dominates the total running time once I/O-efficiency has been obtained. In this
paper we study algorithms for fundamental problems that are simultaneously
I/O-efficient and internal memory efficient in the RAM model of computation.Comment: To appear in Proceedings of ISAAC 2013, getting the Best Paper Awar
I/O efficient bisimulation partitioning on very large directed acyclic graphs
In this paper we introduce the first efficient external-memory algorithm to
compute the bisimilarity equivalence classes of a directed acyclic graph (DAG).
DAGs are commonly used to model data in a wide variety of practical
applications, ranging from XML documents and data provenance models, to web
taxonomies and scientific workflows. In the study of efficient reasoning over
massive graphs, the notion of node bisimilarity plays a central role. For
example, grouping together bisimilar nodes in an XML data set is the first step
in many sophisticated approaches to building indexing data structures for
efficient XPath query evaluation. To date, however, only internal-memory
bisimulation algorithms have been investigated. As the size of real-world DAG
data sets often exceeds available main memory, storage in external memory
becomes necessary. Hence, there is a practical need for an efficient approach
to computing bisimulation in external memory.
Our general algorithm has a worst-case IO-complexity of O(Sort(|N| + |E|)),
where |N| and |E| are the numbers of nodes and edges, resp., in the data graph
and Sort(n) is the number of accesses to external memory needed to sort an
input of size n. We also study specializations of this algorithm to common
variations of bisimulation for tree-structured XML data sets. We empirically
verify efficient performance of the algorithms on graphs and XML documents
having billions of nodes and edges, and find that the algorithms can process
such graphs efficiently even when very limited internal memory is available.
The proposed algorithms are simple enough for practical implementation and use,
and open the door for further study of external-memory bisimulation algorithms.
To this end, the full open-source C++ implementation has been made freely
available
External memory priority queues with decrease-key and applications to graph algorithms
We present priority queues in the external memory model with block size B and main memory size M that support on N elements, operation Update (a combination of operations Insert and DecreaseKey) in O(1/Blog_{M/B} N/B) amortized I/Os and operations ExtractMin and Delete in O(ceil[(M^epsilon)/B log_{M/B} N/B] log_{M/B} N/B) amortized I/Os, for any real epsilon in (0,1), using O(N/Blog_{M/B} N/B) blocks. Previous I/O-efficient priority queues either support these operations in O(1/Blog_2 N/B) amortized I/Os [Kumar and Schwabe, SPDP \u2796] or support only operations Insert, Delete and ExtractMin in optimal O(1/Blog_{M/B} N/B) amortized I/Os, however without supporting DecreaseKey [Fadel et al., TCS \u2799].
We also present buffered repository trees that support on a multi-set of N elements, operation Insert in O(1/Blog_M/B N/B) I/Os and operation Extract on K extracted elements in O(M^{epsilon} log_M/B N/B + K/B) amortized I/Os, using O(N/B) blocks. Previous results achieve O(1/Blog_2 N/B) I/Os and O(log_2 N/B + K/B) I/Os, respectively [Buchsbaum et al., SODA \u2700].
Our results imply improved O(E/Blog_{M/B} E/B) I/Os for single-source shortest paths, depth-first search and breadth-first search algorithms on massive directed dense graphs (V,E) with E = Omega (V^(1+epsilon)), epsilon > 0 and V = Omega (M), which is equal to the I/O-optimal bound for sorting E values in external memory
The Lock-free -LSM Relaxed Priority Queue
Priority queues are data structures which store keys in an ordered fashion to
allow efficient access to the minimal (maximal) key. Priority queues are
essential for many applications, e.g., Dijkstra's single-source shortest path
algorithm, branch-and-bound algorithms, and prioritized schedulers.
Efficient multiprocessor computing requires implementations of basic data
structures that can be used concurrently and scale to large numbers of threads
and cores. Lock-free data structures promise superior scalability by avoiding
blocking synchronization primitives, but the \emph{delete-min} operation is an
inherent scalability bottleneck in concurrent priority queues. Recent work has
focused on alleviating this obstacle either by batching operations, or by
relaxing the requirements to the \emph{delete-min} operation.
We present a new, lock-free priority queue that relaxes the \emph{delete-min}
operation so that it is allowed to delete \emph{any} of the smallest
keys, where is a runtime configurable parameter. Additionally, the
behavior is identical to a non-relaxed priority queue for items added and
removed by the same thread. The priority queue is built from a logarithmic
number of sorted arrays in a way similar to log-structured merge-trees. We
experimentally compare our priority queue to recent state-of-the-art lock-free
priority queues, both with relaxed and non-relaxed semantics, showing high
performance and good scalability of our approach.Comment: Short version as ACM PPoPP'15 poste
Lower Bounds for Oblivious Data Structures
An oblivious data structure is a data structure where the memory access
patterns reveals no information about the operations performed on it. Such data
structures were introduced by Wang et al. [ACM SIGSAC'14] and are intended for
situations where one wishes to store the data structure at an untrusted server.
One way to obtain an oblivious data structure is simply to run a classic data
structure on an oblivious RAM (ORAM). Until very recently, this resulted in an
overhead of for the most natural setting of parameters.
Moreover, a recent lower bound for ORAMs by Larsen and Nielsen [CRYPTO'18] show
that they always incur an overhead of at least if used in a
black box manner. To circumvent the overhead, researchers have
instead studied classic data structure problems more directly and have obtained
efficient solutions for many such problems such as stacks, queues, deques,
priority queues and search trees. However, none of these data structures
process operations faster than , leaving open the question of
whether even faster solutions exist. In this paper, we rule out this
possibility by proving lower bounds for oblivious stacks,
queues, deques, priority queues and search trees.Comment: To appear at SODA'1
- …