
882 research outputs found

I/O-optimal algorithms on grid graphs

Author: Haverkort Herman
Publication date: 01/01/2012

Given a graph whose n vertices form a regular two-dimensional grid, and in which each (possibly weighted and/or directed) edge connects a vertex to one of its eight neighbours, the following can be done in O(scan(n)) I/Os, provided M = Omega(B^2): computation of shortest paths with non-negative edge weights from a single source, breadth-first traversal, computation of a minimum spanning tree, topological sorting, time-forward processing (if the input is a plane graph), and an Euler tour (if the input graph is a tree). The minimum-spanning tree algorithm is cache-oblivious. The best previously published algorithms for these problems need Theta(sort(n)) I/Os. Estimates of the actual I/O volume show that the new algorithms may often be very efficient in practice.

Comment: 12 pages' extended abstract plus 12 pages' appendix with details, proofs and calculations. Has not been published in and is currently not under review of any conference or journal.
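To make the gap between O(scan(n)) and Theta(sort(n)) concrete, the sketch below evaluates the standard external-memory cost formulas, scan(n) = n/B and sort(n) = (n/B) log_{M/B}(n/B). The function names and the sample values of n, M and B are illustrative assumptions, not taken from the paper, and constant factors are ignored.

```python
import math


def scan_ios(n, B):
    """Blocks read when streaming n items once: ceil(n / B)."""
    return math.ceil(n / B)


def sort_ios(n, M, B):
    """Textbook sorting bound (n/B) * log_{M/B}(n/B), with at least one pass.

    Assumes M > B so that the logarithm base M/B exceeds 1.
    """
    blocks = math.ceil(n / B)
    passes = max(1.0, math.log(blocks) / math.log(M / B))
    return blocks * passes


if __name__ == "__main__":
    # Hypothetical machine: memory M = B^2, matching the M = Omega(B^2) condition.
    n, B = 10**9, 10**3
    M = B * B
    print("scan(n):", scan_ios(n, B))
    print("sort(n):", sort_ios(n, M, B))
```

Under these assumed parameters the sorting bound exceeds the scanning bound only by the logarithmic pass count, which is why the abstract's claim about practical efficiency rests on estimates of actual I/O volume rather than on the asymptotic gap alone.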

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Lower Bounds for Oblivious Near-Neighbor Search

Author: Larsen Kasper Green
Malkin Tal
Weinstein Omri
Yeo Kevin
Publication date: 09/04/2019

We prove an $\Omega(d \lg n/ (\lg\lg n)^2)$ lower bound on the dynamic cell-probe complexity of statistically $\mathit{oblivious}$ approximate-near-neighbor search ($\mathsf{ANN}$) over the $d$-dimensional Hamming cube. For the natural setting of $d = \Theta(\log n)$, our result implies an $\tilde{\Omega}(\lg^2 n)$ lower bound, which is a quadratic improvement over the highest (non-oblivious) cell-probe lower bound for $\mathsf{ANN}$. This is the first super-logarithmic $\mathit{unconditional}$ lower bound for $\mathsf{ANN}$ against general (non black-box) data structures. We also show that any oblivious $\mathit{static}$ data structure for decomposable search problems (like $\mathsf{ANN}$) can be obliviously dynamized with $O(\log n)$ overhead in update and query time, strengthening a classic result of Bentley and Saxe (Algorithmica, 1980).

Comment: 28 pages
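For readers unfamiliar with the Bentley and Saxe result mentioned above, the following is a minimal sketch of the classic (non-oblivious) logarithmic method for dynamizing a static structure for a decomposable search problem. It does not reproduce the paper's oblivious construction. The class name `LogMethodNearest`, the choice of one-dimensional nearest-value search as the decomposable problem, and the use of a sorted list as the "static structure" are all assumptions made for illustration.

```python
import bisect


class LogMethodNearest:
    """Insert-only nearest-value search via the logarithmic method.

    levels[i] is either None or a sorted list holding exactly 2**i keys.
    Each sorted list plays the role of a static structure that is only
    ever rebuilt from scratch, never updated in place.
    """

    def __init__(self):
        self.levels = []

    def insert(self, key):
        # Works like incrementing a binary counter: merge full levels upward.
        carry = [key]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append(None)
            if self.levels[i] is None:
                self.levels[i] = carry
                return
            carry = sorted(carry + self.levels[i])  # rebuild the static structure
            self.levels[i] = None
            i += 1

    def nearest(self, q):
        # Decomposability: query every level, then combine the partial answers.
        best = None
        for level in self.levels:
            if level is None:
                continue
            pos = bisect.bisect_left(level, q)
            for cand in level[max(0, pos - 1):pos + 1]:
                if best is None or abs(cand - q) < abs(best - q):
                    best = cand
        return best


if __name__ == "__main__":
    s = LogMethodNearest()
    for k in [10, 3, 25, 7, 18]:
        s.insert(k)
    print(s.nearest(8))
```

With n keys there are at most about log2(n) + 1 non-empty levels, so a query touches logarithmically many static structures; this is the source of the logarithmic overhead in the classic, non-oblivious setting.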

arXiv.org e-Print Archive

Crossref

Cryptology ePrint Archive

RAM-Efficient External Memory Sorting

Author: A. Aggarwal
D. Comer
L. Arge
M. Thorup
R. Fadel
Publication date: 01/01/2013

In recent years a large number of problems have been considered in external memory models of computation, where the complexity measure is the number of blocks of data that are moved between slow external memory and fast internal memory (also called I/Os). In practice, however, internal memory time often dominates the total running time once I/O-efficiency has been obtained. In this paper we study algorithms for fundamental problems that are simultaneously I/O-efficient and internal memory efficient in the RAM model of computation.

Comment: To appear in Proceedings of ISAAC 2013, getting the Best Paper Award

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Lower Bounds for Oblivious Data Structures

Author: Jacob Riko
Larsen Kasper Green
Nielsen Jesper Buus
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 24/10/2018

An oblivious data structure is a data structure whose memory access pattern reveals no information about the operations performed on it. Such data structures were introduced by Wang et al. [ACM SIGSAC'14] and are intended for situations where one wishes to store the data structure at an untrusted server. One way to obtain an oblivious data structure is simply to run a classic data structure on an oblivious RAM (ORAM). Until very recently, this resulted in an overhead of $\omega(\lg n)$ for the most natural setting of parameters. Moreover, a recent lower bound for ORAMs by Larsen and Nielsen [CRYPTO'18] shows that they always incur an overhead of at least $\Omega(\lg n)$ if used in a black box manner. To circumvent the $\omega(\lg n)$ overhead, researchers have instead studied classic data structure problems more directly and have obtained efficient solutions for many such problems, such as stacks, queues, deques, priority queues and search trees. However, none of these data structures process operations faster than $\Theta(\lg n)$, leaving open the question of whether even faster solutions exist. In this paper, we rule out this possibility by proving $\Omega(\lg n)$ lower bounds for oblivious stacks, queues, deques, priority queues and search trees.

Comment: To appear at SODA'1

arXiv.org e-Print Archive

Crossref

The IT University of Copenhagen's Repository

Configurable Strategies for Work-stealing

Author: Cederman Daniel
Träff Jesper Larsson
Tsigas Philippas
Wimmer Martin
Publication date: 01/01/2013

Work-stealing systems are typically oblivious to the nature of the tasks they are scheduling. For instance, they do not know or take into account how long a task will take to execute or how many subtasks it will spawn. Moreover, the actual task execution order is typically determined by the underlying task storage data structure, and cannot be changed. There are thus possibilities for optimizing task parallel executions by providing information on specific tasks and their preferred execution order to the scheduling system. We introduce scheduling strategies to enable applications to dynamically provide hints to the task-scheduling system on the nature of specific tasks. Scheduling strategies can be used to independently control both local task execution order and steal order. In contrast to conventional scheduling policies that are normally global in scope, strategies allow the scheduler to apply optimizations on individual tasks. This flexibility greatly improves composability as it allows the scheduler to apply different, specific scheduling choices for different parts of applications simultaneously. We present a number of benchmarks that highlight diverse, beneficial effects that can be achieved with scheduling strategies. Some benchmarks (branch-and-bound, single-source shortest path) show that prioritization of tasks can reduce the total amount of work compared to standard work-stealing execution order. For other benchmarks (triangle strip generation) qualitatively better results can be achieved in shorter time. Other optimizations, such as dynamic merging of tasks or stealing of half the work, instead of half the tasks, are also shown to improve performance. Composability is demonstrated by examples that combine different strategies, both within the same kernel (prefix sum) and when scheduling multiple kernels (prefix sum and unbalanced tree search).

arXiv.org e-Print Archive

Chalmers Research

Chalmers Publication Library

Exploiting non-constant safe memory in resilient algorithms and data structures

Author: DE STEFANI LORENZO
SILVESTRI FRANCESCO
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

We extend the Faulty RAM model by Finocchi and Italiano (2008) by adding a safe memory of arbitrary size $S$, and we then derive tradeoffs between the performance of resilient algorithmic techniques and the size of the safe memory. Let $\delta$ and $\alpha$ denote, respectively, the maximum number of faults that can happen during the execution of an algorithm and the actual number of faults that occurred, with $\alpha \leq \delta$. We propose a resilient algorithm for sorting $n$ entries which requires $O\left(n\log n+\alpha (\delta/S + \log S)\right)$ time and uses $\Theta(S)$ safe memory words. Our algorithm outperforms previous resilient sorting algorithms which do not exploit the available safe memory and require $O\left(n\log n+ \alpha\delta\right)$ time. Finally, we exploit our sorting algorithm for deriving a resilient priority queue. Our implementation uses $\Theta(S)$ safe memory words and $\Theta(n)$ faulty memory words for storing $n$ keys, and requires $O\left(\log n + \delta/S\right)$ amortized time for each insert and deletemin operation. Our resilient priority queue improves the $O\left(\log n + \delta\right)$ amortized time required by the state of the art.

Comment: To appear in Theoretical Computer Science, 201
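As a quick sanity check of the tradeoff, the bounds can be instantiated for one illustrative choice of safe-memory size, $S = \delta$. This choice is an assumption made here for the example, not a setting singled out in the abstract; it uses only the bounds stated above.

```latex
\[
  \underbrace{O\!\left(n\log n + \alpha\left(\tfrac{\delta}{S} + \log S\right)\right)}_{\text{sorting with } \Theta(S) \text{ safe memory}}
  \;\xrightarrow{\;S=\delta\;}\;
  O\!\left(n\log n + \alpha\,(1 + \log\delta)\right)
  \qquad\text{versus}\qquad
  O\!\left(n\log n + \alpha\delta\right),
\]
\[
  \underbrace{O\!\left(\log n + \tfrac{\delta}{S}\right)}_{\text{priority queue, amortized}}
  \;\xrightarrow{\;S=\delta\;}\;
  O\!\left(\log n + 1\right) = O(\log n)
  \qquad\text{versus}\qquad
  O\!\left(\log n + \delta\right).
\]
```

In this instance the per-fault cost of sorting drops from linear in $\delta$ to logarithmic in $\delta$, and the additive $\delta$ term in the priority queue bound disappears.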

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

OutFlank Routing: Increasing Throughput in Toroidal Interconnection Networks

Author: Versaci Francesco
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/10/2013

We present a new, deadlock-free routing scheme for toroidal interconnection networks, called OutFlank Routing (OFR). OFR is an adaptive strategy which exploits non-minimal links, both in the source and in the destination nodes. When minimal links are congested, OFR deroutes packets to carefully chosen intermediate destinations, in order to obtain travel paths which are only an additive constant longer than the shortest ones. Since routing performance is very sensitive to changes in the traffic model or in the router parameters, an accurate discrete-event simulator of the toroidal network has been developed to empirically validate OFR, by comparing it against other relevant routing strategies, over a range of typical real-world traffic patterns. On the 16x16x16 (4096 nodes) simulated network, OFR improves the maximum sustained throughput by between 14% and 114% with respect to Adaptive Bubble Routing.

Comment: 9 pages, 5 figures, to be presented at ICPADS 201

arXiv.org e-Print Archive

Crossref