2,020 research outputs found

    Optimal Hierarchical Layouts for Cache-Oblivious Search Trees

    Full text link
    This paper proposes a general framework for generating cache-oblivious layouts for binary search trees. A cache-oblivious layout attempts to minimize cache misses on any hierarchical memory, independent of the number of memory levels and attributes at each level such as cache size, line size, and replacement policy. Recursively partitioning a tree into contiguous subtrees and prescribing an ordering amongst the subtrees, Hierarchical Layouts generalize many commonly used layouts for trees such as in-order, pre-order and breadth-first. They also generalize the various flavors of the van Emde Boas layout, which have previously been used as cache-oblivious layouts. Hierarchical Layouts thus unify all previous attempts at deriving layouts for search trees. The paper then derives a new locality measure (the Weighted Edge Product) that mimics the probability of cache misses at multiple levels, and shows that layouts that reduce this measure perform better. We analyze the various degrees of freedom in the construction of Hierarchical Layouts, and investigate the relative effect of each of these decisions in the construction of cache-oblivious layouts. Optimizing the Weighted Edge Product for complete binary search trees, we introduce the MinWEP layout, and show that it outperforms previously used cache-oblivious layouts by almost 20%.Comment: Extended version with proofs added to the appendi

    I/O-optimal algorithms on grid graphs

    Full text link
    Given a graph of which the n vertices form a regular two-dimensional grid, and in which each (possibly weighted and/or directed) edge connects a vertex to one of its eight neighbours, the following can be done in O(scan(n)) I/Os, provided M = Omega(B^2): computation of shortest paths with non-negative edge weights from a single source, breadth-first traversal, computation of a minimum spanning tree, topological sorting, time-forward processing (if the input is a plane graph), and an Euler tour (if the input graph is a tree). The minimum-spanning tree algorithm is cache-oblivious. The best previously published algorithms for these problems need Theta(sort(n)) I/Os. Estimates of the actual I/O volume show that the new algorithms may often be very efficient in practice.Comment: 12 pages' extended abstract plus 12 pages' appendix with details, proofs and calculations. Has not been published in and is currently not under review of any conference or journa

    Parallel Minimum Cuts in Near-linear Work and Low Depth

    Full text link
    We present the first near-linear work and poly-logarithmic depth algorithm for computing a minimum cut in a graph, while previous parallel algorithms with poly-logarithmic depth required at least quadratic work in the number of vertices. In a graph with nn vertices and mm edges, our algorithm computes the correct result with high probability in O(mlog4n)O(m {\log}^4 n) work and O(log3n)O({\log}^3 n) depth. This result is obtained by parallelizing a data structure that aggregates weights along paths in a tree and by exploiting the connection between minimum cuts and approximate maximum packings of spanning trees. In addition, our algorithm improves upon bounds on the number of cache misses incurred to compute a minimum cut

    04091 Abstracts Collection -- Data Structures

    Get PDF
    From 22.02. to 27.02.2004, Dagstuhl Seminar "Data Structures" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar are put together in this paper. The first section describes the seminar topics and goals in general

    Analysis of Power-aware Buffering Schemes in Wireless Sensor Networks

    Full text link
    We study the power-aware buffering problem in battery-powered sensor networks, focusing on the fixed-size and fixed-interval buffering schemes. The main motivation is to address the yet poorly understood size variation-induced effect on power-aware buffering schemes. Our theoretical analysis elucidates the fundamental differences between the fixed-size and fixed-interval buffering schemes in the presence of data size variation. It shows that data size variation has detrimental effects on the power expenditure of the fixed-size buffering in general, and reveals that the size variation induced effects can be either mitigated by a positive skewness or promoted by a negative skewness in size distribution. By contrast, the fixed-interval buffering scheme has an obvious advantage of being eminently immune to the data-size variation. Hence the fixed-interval buffering scheme is a risk-averse strategy for its robustness in a variety of operational environments. In addition, based on the fixed-interval buffering scheme, we establish the power consumption relationship between child nodes and parent node in a static data collection tree, and give an in-depth analysis of the impact of child bandwidth distribution on parent's power consumption. This study is of practical significance: it sheds new light on the relationship among power consumption of buffering schemes, power parameters of radio module and memory bank, data arrival rate and data size variation, thereby providing well-informed guidance in determining an optimal buffer size (interval) to maximize the operational lifespan of sensor networks

    Configurable Strategies for Work-stealing

    Full text link
    Work-stealing systems are typically oblivious to the nature of the tasks they are scheduling. For instance, they do not know or take into account how long a task will take to execute or how many subtasks it will spawn. Moreover, the actual task execution order is typically determined by the underlying task storage data structure, and cannot be changed. There are thus possibilities for optimizing task parallel executions by providing information on specific tasks and their preferred execution order to the scheduling system. We introduce scheduling strategies to enable applications to dynamically provide hints to the task-scheduling system on the nature of specific tasks. Scheduling strategies can be used to independently control both local task execution order as well as steal order. In contrast to conventional scheduling policies that are normally global in scope, strategies allow the scheduler to apply optimizations on individual tasks. This flexibility greatly improves composability as it allows the scheduler to apply different, specific scheduling choices for different parts of applications simultaneously. We present a number of benchmarks that highlight diverse, beneficial effects that can be achieved with scheduling strategies. Some benchmarks (branch-and-bound, single-source shortest path) show that prioritization of tasks can reduce the total amount of work compared to standard work-stealing execution order. For other benchmarks (triangle strip generation) qualitatively better results can be achieved in shorter time. Other optimizations, such as dynamic merging of tasks or stealing of half the work, instead of half the tasks, are also shown to improve performance. Composability is demonstrated by examples that combine different strategies, both within the same kernel (prefix sum) as well as when scheduling multiple kernels (prefix sum and unbalanced tree search)

    Correlation-Aware Distributed Caching and Coded Delivery

    Full text link
    Cache-aided coded multicast leverages side information at wireless edge caches to efficiently serve multiple groupcast demands via common multicast transmissions, leading to load reductions that are proportional to the aggregate cache size. However, the increasingly unpredictable and personalized nature of the content that users consume challenges the efficiency of existing caching-based solutions in which only exact content reuse is explored. This paper generalizes the cache-aided coded multicast problem to a source compression with distributed side information problem that specifically accounts for the correlation among the content files. It is shown how joint file compression during the caching and delivery phases can provide load reductions that go beyond those achieved with existing schemes. This is accomplished through a lower bound on the fundamental rate-memory trade-off as well as a correlation-aware achievable scheme, shown to significantly outperform state-of-the-art correlation-unaware solutions, while approaching the limiting rate-memory trade-off.Comment: In proceeding of IEEE Information Theory Workshop (ITW), 201
    corecore