882 research outputs found
I/O-optimal algorithms on grid graphs
Given a graph of which the n vertices form a regular two-dimensional grid,
and in which each (possibly weighted and/or directed) edge connects a vertex to
one of its eight neighbours, the following can be done in O(scan(n)) I/Os,
provided M = Omega(B^2): computation of shortest paths with non-negative edge
weights from a single source, breadth-first traversal, computation of a minimum
spanning tree, topological sorting, time-forward processing (if the input is a
plane graph), and an Euler tour (if the input graph is a tree). The
minimum-spanning tree algorithm is cache-oblivious. The best previously
published algorithms for these problems need Theta(sort(n)) I/Os. Estimates of
the actual I/O volume show that the new algorithms may often be very efficient
in practice.Comment: 12 pages' extended abstract plus 12 pages' appendix with details,
proofs and calculations. Has not been published in and is currently not under
review of any conference or journa
Lower Bounds for Oblivious Near-Neighbor Search
We prove an lower bound on the dynamic
cell-probe complexity of statistically
approximate-near-neighbor search () over the -dimensional
Hamming cube. For the natural setting of , our result
implies an lower bound, which is a quadratic
improvement over the highest (non-oblivious) cell-probe lower bound for
. This is the first super-logarithmic
lower bound for against general (non black-box) data structures.
We also show that any oblivious data structure for
decomposable search problems (like ) can be obliviously dynamized
with overhead in update and query time, strengthening a classic
result of Bentley and Saxe (Algorithmica, 1980).Comment: 28 page
RAM-Efficient External Memory Sorting
In recent years a large number of problems have been considered in external
memory models of computation, where the complexity measure is the number of
blocks of data that are moved between slow external memory and fast internal
memory (also called I/Os). In practice, however, internal memory time often
dominates the total running time once I/O-efficiency has been obtained. In this
paper we study algorithms for fundamental problems that are simultaneously
I/O-efficient and internal memory efficient in the RAM model of computation.Comment: To appear in Proceedings of ISAAC 2013, getting the Best Paper Awar
Lower Bounds for Oblivious Data Structures
An oblivious data structure is a data structure where the memory access
patterns reveals no information about the operations performed on it. Such data
structures were introduced by Wang et al. [ACM SIGSAC'14] and are intended for
situations where one wishes to store the data structure at an untrusted server.
One way to obtain an oblivious data structure is simply to run a classic data
structure on an oblivious RAM (ORAM). Until very recently, this resulted in an
overhead of for the most natural setting of parameters.
Moreover, a recent lower bound for ORAMs by Larsen and Nielsen [CRYPTO'18] show
that they always incur an overhead of at least if used in a
black box manner. To circumvent the overhead, researchers have
instead studied classic data structure problems more directly and have obtained
efficient solutions for many such problems such as stacks, queues, deques,
priority queues and search trees. However, none of these data structures
process operations faster than , leaving open the question of
whether even faster solutions exist. In this paper, we rule out this
possibility by proving lower bounds for oblivious stacks,
queues, deques, priority queues and search trees.Comment: To appear at SODA'1
Configurable Strategies for Work-stealing
Work-stealing systems are typically oblivious to the nature of the tasks they
are scheduling. For instance, they do not know or take into account how long a
task will take to execute or how many subtasks it will spawn. Moreover, the
actual task execution order is typically determined by the underlying task
storage data structure, and cannot be changed. There are thus possibilities for
optimizing task parallel executions by providing information on specific tasks
and their preferred execution order to the scheduling system.
We introduce scheduling strategies to enable applications to dynamically
provide hints to the task-scheduling system on the nature of specific tasks.
Scheduling strategies can be used to independently control both local task
execution order as well as steal order. In contrast to conventional scheduling
policies that are normally global in scope, strategies allow the scheduler to
apply optimizations on individual tasks. This flexibility greatly improves
composability as it allows the scheduler to apply different, specific
scheduling choices for different parts of applications simultaneously. We
present a number of benchmarks that highlight diverse, beneficial effects that
can be achieved with scheduling strategies. Some benchmarks (branch-and-bound,
single-source shortest path) show that prioritization of tasks can reduce the
total amount of work compared to standard work-stealing execution order. For
other benchmarks (triangle strip generation) qualitatively better results can
be achieved in shorter time. Other optimizations, such as dynamic merging of
tasks or stealing of half the work, instead of half the tasks, are also shown
to improve performance. Composability is demonstrated by examples that combine
different strategies, both within the same kernel (prefix sum) as well as when
scheduling multiple kernels (prefix sum and unbalanced tree search)
Exploiting non-constant safe memory in resilient algorithms and data structures
We extend the Faulty RAM model by Finocchi and Italiano (2008) by adding a
safe memory of arbitrary size , and we then derive tradeoffs between the
performance of resilient algorithmic techniques and the size of the safe
memory. Let and denote, respectively, the maximum amount of
faults which can happen during the execution of an algorithm and the actual
number of occurred faults, with . We propose a resilient
algorithm for sorting entries which requires time and uses safe memory words. Our
algorithm outperforms previous resilient sorting algorithms which do not
exploit the available safe memory and require time. Finally, we exploit our sorting algorithm for
deriving a resilient priority queue. Our implementation uses safe
memory words and faulty memory words for storing keys, and
requires amortized time for each insert and
deletemin operation. Our resilient priority queue improves the amortized time required by the state of the art.Comment: To appear in Theoretical Computer Science, 201
OutFlank Routing: Increasing Throughput in Toroidal Interconnection Networks
We present a new, deadlock-free, routing scheme for toroidal interconnection
networks, called OutFlank Routing (OFR). OFR is an adaptive strategy which
exploits non-minimal links, both in the source and in the destination nodes.
When minimal links are congested, OFR deroutes packets to carefully chosen
intermediate destinations, in order to obtain travel paths which are only an
additive constant longer than the shortest ones. Since routing performance is
very sensitive to changes in the traffic model or in the router parameters, an
accurate discrete-event simulator of the toroidal network has been developed to
empirically validate OFR, by comparing it against other relevant routing
strategies, over a range of typical real-world traffic patterns. On the
16x16x16 (4096 nodes) simulated network OFR exhibits improvements of the
maximum sustained throughput between 14% and 114%, with respect to Adaptive
Bubble Routing.Comment: 9 pages, 5 figures, to be presented at ICPADS 201
- …