2,020 research outputs found
Optimal Hierarchical Layouts for Cache-Oblivious Search Trees
This paper proposes a general framework for generating cache-oblivious
layouts for binary search trees. A cache-oblivious layout attempts to minimize
cache misses on any hierarchical memory, independent of the number of memory
levels and attributes at each level such as cache size, line size, and
replacement policy. Recursively partitioning a tree into contiguous subtrees
and prescribing an ordering amongst the subtrees, Hierarchical Layouts
generalize many commonly used layouts for trees such as in-order, pre-order and
breadth-first. They also generalize the various flavors of the van Emde Boas
layout, which have previously been used as cache-oblivious layouts.
Hierarchical Layouts thus unify all previous attempts at deriving layouts for
search trees.
The paper then derives a new locality measure (the Weighted Edge Product)
that mimics the probability of cache misses at multiple levels, and shows that
layouts that reduce this measure perform better. We analyze the various degrees
of freedom in the construction of Hierarchical Layouts, and investigate the
relative effect of each of these decisions in the construction of
cache-oblivious layouts. Optimizing the Weighted Edge Product for complete
binary search trees, we introduce the MinWEP layout, and show that it
outperforms previously used cache-oblivious layouts by almost 20%.Comment: Extended version with proofs added to the appendi
I/O-optimal algorithms on grid graphs
Given a graph of which the n vertices form a regular two-dimensional grid,
and in which each (possibly weighted and/or directed) edge connects a vertex to
one of its eight neighbours, the following can be done in O(scan(n)) I/Os,
provided M = Omega(B^2): computation of shortest paths with non-negative edge
weights from a single source, breadth-first traversal, computation of a minimum
spanning tree, topological sorting, time-forward processing (if the input is a
plane graph), and an Euler tour (if the input graph is a tree). The
minimum-spanning tree algorithm is cache-oblivious. The best previously
published algorithms for these problems need Theta(sort(n)) I/Os. Estimates of
the actual I/O volume show that the new algorithms may often be very efficient
in practice.Comment: 12 pages' extended abstract plus 12 pages' appendix with details,
proofs and calculations. Has not been published in and is currently not under
review of any conference or journa
Parallel Minimum Cuts in Near-linear Work and Low Depth
We present the first near-linear work and poly-logarithmic depth algorithm
for computing a minimum cut in a graph, while previous parallel algorithms with
poly-logarithmic depth required at least quadratic work in the number of
vertices. In a graph with vertices and edges, our algorithm computes
the correct result with high probability in work and
depth. This result is obtained by parallelizing a data
structure that aggregates weights along paths in a tree and by exploiting the
connection between minimum cuts and approximate maximum packings of spanning
trees. In addition, our algorithm improves upon bounds on the number of cache
misses incurred to compute a minimum cut
04091 Abstracts Collection -- Data Structures
From 22.02. to 27.02.2004, Dagstuhl Seminar "Data Structures"
was held in the International Conference and Research Center (IBFI),
Schloss Dagstuhl.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed.
Abstracts of the presentations given during the seminar
are put together in this paper. The first section
describes the seminar topics and goals in general
Analysis of Power-aware Buffering Schemes in Wireless Sensor Networks
We study the power-aware buffering problem in battery-powered sensor
networks, focusing on the fixed-size and fixed-interval buffering schemes. The
main motivation is to address the yet poorly understood size variation-induced
effect on power-aware buffering schemes. Our theoretical analysis elucidates
the fundamental differences between the fixed-size and fixed-interval buffering
schemes in the presence of data size variation. It shows that data size
variation has detrimental effects on the power expenditure of the fixed-size
buffering in general, and reveals that the size variation induced effects can
be either mitigated by a positive skewness or promoted by a negative skewness
in size distribution. By contrast, the fixed-interval buffering scheme has an
obvious advantage of being eminently immune to the data-size variation. Hence
the fixed-interval buffering scheme is a risk-averse strategy for its
robustness in a variety of operational environments. In addition, based on the
fixed-interval buffering scheme, we establish the power consumption
relationship between child nodes and parent node in a static data collection
tree, and give an in-depth analysis of the impact of child bandwidth
distribution on parent's power consumption.
This study is of practical significance: it sheds new light on the
relationship among power consumption of buffering schemes, power parameters of
radio module and memory bank, data arrival rate and data size variation,
thereby providing well-informed guidance in determining an optimal buffer size
(interval) to maximize the operational lifespan of sensor networks
Configurable Strategies for Work-stealing
Work-stealing systems are typically oblivious to the nature of the tasks they
are scheduling. For instance, they do not know or take into account how long a
task will take to execute or how many subtasks it will spawn. Moreover, the
actual task execution order is typically determined by the underlying task
storage data structure, and cannot be changed. There are thus possibilities for
optimizing task parallel executions by providing information on specific tasks
and their preferred execution order to the scheduling system.
We introduce scheduling strategies to enable applications to dynamically
provide hints to the task-scheduling system on the nature of specific tasks.
Scheduling strategies can be used to independently control both local task
execution order as well as steal order. In contrast to conventional scheduling
policies that are normally global in scope, strategies allow the scheduler to
apply optimizations on individual tasks. This flexibility greatly improves
composability as it allows the scheduler to apply different, specific
scheduling choices for different parts of applications simultaneously. We
present a number of benchmarks that highlight diverse, beneficial effects that
can be achieved with scheduling strategies. Some benchmarks (branch-and-bound,
single-source shortest path) show that prioritization of tasks can reduce the
total amount of work compared to standard work-stealing execution order. For
other benchmarks (triangle strip generation) qualitatively better results can
be achieved in shorter time. Other optimizations, such as dynamic merging of
tasks or stealing of half the work, instead of half the tasks, are also shown
to improve performance. Composability is demonstrated by examples that combine
different strategies, both within the same kernel (prefix sum) as well as when
scheduling multiple kernels (prefix sum and unbalanced tree search)
Correlation-Aware Distributed Caching and Coded Delivery
Cache-aided coded multicast leverages side information at wireless edge
caches to efficiently serve multiple groupcast demands via common multicast
transmissions, leading to load reductions that are proportional to the
aggregate cache size. However, the increasingly unpredictable and personalized
nature of the content that users consume challenges the efficiency of existing
caching-based solutions in which only exact content reuse is explored. This
paper generalizes the cache-aided coded multicast problem to a source
compression with distributed side information problem that specifically
accounts for the correlation among the content files. It is shown how joint
file compression during the caching and delivery phases can provide load
reductions that go beyond those achieved with existing schemes. This is
accomplished through a lower bound on the fundamental rate-memory trade-off as
well as a correlation-aware achievable scheme, shown to significantly
outperform state-of-the-art correlation-unaware solutions, while approaching
the limiting rate-memory trade-off.Comment: In proceeding of IEEE Information Theory Workshop (ITW), 201
- …