
    Approximate Range Emptiness in Constant Time and Optimal Space

    This paper studies the \emph{$\varepsilon$-approximate range emptiness} problem, where the task is to represent a set $S$ of $n$ points from $\{0,\ldots,U-1\}$ and answer emptiness queries of the form "$[a;b] \cap S \neq \emptyset$?" with a probability of \emph{false positives} allowed. This generalizes the functionality of \emph{Bloom filters} from single point queries to intervals of length up to $L$. Setting the false positive rate to $\varepsilon/L$ and performing $L$ queries, Bloom filters yield a solution to this problem with space $O(n \lg(L/\varepsilon))$ bits, false positive probability bounded by $\varepsilon$ for intervals of length up to $L$, using query time $O(L \lg(L/\varepsilon))$. Our first contribution is to show that the space/error trade-off cannot be improved asymptotically: any data structure for answering approximate range emptiness queries on intervals of length up to $L$ with false positive probability $\varepsilon$ must use space $\Omega(n \lg(L/\varepsilon)) - O(n)$ bits. On the positive side, we show that the query time can be improved greatly, to constant time, while matching our space lower bound up to a lower-order additive term. This result is achieved through a succinct data structure for (non-approximate 1D) range emptiness/reporting queries, which may be of independent interest.
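    A minimal sketch of the Bloom-filter baseline described above, assuming a textbook filter with the standard parameter choices; the BloomFilter class and its hashing scheme are illustrative, not the paper's constant-time structure. Setting the per-point rate to $\varepsilon/L$ and probing every point of the interval keeps the total error at most $\varepsilon$ by a union bound.

        import hashlib
        import math

        class BloomFilter:
            """Textbook Bloom filter (illustrative, not the paper's structure)."""
            def __init__(self, n, fp_rate):
                # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m/n) ln 2 hashes.
                self.m = max(1, math.ceil(-n * math.log(fp_rate) / math.log(2) ** 2))
                self.k = max(1, round(self.m / n * math.log(2)))
                self.bits = bytearray((self.m + 7) // 8)

            def _hashes(self, x):
                for i in range(self.k):
                    h = hashlib.blake2b(f"{i}:{x}".encode()).digest()
                    yield int.from_bytes(h[:8], "big") % self.m

            def add(self, x):
                for h in self._hashes(x):
                    self.bits[h // 8] |= 1 << (h % 8)

            def __contains__(self, x):
                return all(self.bits[h // 8] & (1 << (h % 8)) for h in self._hashes(x))

        def build(points, L, eps):
            bf = BloomFilter(len(points), eps / L)  # per-point rate eps / L
            for p in points:
                bf.add(p)
            return bf

        def maybe_nonempty(bf, a, b):
            # Up to L probes, each erring with prob. eps/L, so interval error <= eps.
            return any(x in bf for x in range(a, b + 1))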

    Triangle Counting in Dynamic Graph Streams

    Estimating the number of triangles in graph streams using a limited amount of memory has become a popular topic in the last decade. Different variations of the problem have been studied, depending on whether the graph edges are provided in an arbitrary order or as incidence lists. However, with a few exceptions, the algorithms have considered {\em insert-only} streams. We present a new algorithm estimating the number of triangles in {\em dynamic} graph streams where edges can be both inserted and deleted. We show that our algorithm achieves better time and space complexity than previous solutions for various graph classes, for example sparse graphs with a relatively small number of triangles. Also, for graphs with constant transitivity coefficient, a common situation in real graphs, this is the first algorithm achieving constant processing time per edge. The result is achieved by a novel approach combining sampling of vertex triples and sparsification of the input graph. In the course of the analysis of the algorithm we present a lower bound on the number of pairwise independent 2-paths in general graphs which might be of independent interest. At the end of the paper we discuss lower bounds on the space complexity of triangle counting algorithms that make no assumptions on the structure of the graph.
    Comment: New version of a SWAT 2014 paper with improved result.
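    To make the triple-sampling idea concrete, here is a heavily simplified sketch; the class name and all parameters are illustrative, and it omits the paper's graph sparsification and therefore its complexity guarantees. Fix random vertex triples up front, track which are currently complete as edges are inserted and deleted, and scale the complete fraction by the total number of triples.

        import random

        class TripleSampleEstimator:
            """Toy dynamic triangle estimator via vertex-triple sampling."""
            def __init__(self, n, num_samples, seed=0):
                rng = random.Random(seed)
                self.n = n
                self.triples = [tuple(sorted(rng.sample(range(n), 3)))
                                for _ in range(num_samples)]
                self.present = {t: set() for t in self.triples}  # edges seen per triple
                self.index = {}  # edge -> sampled triples containing it
                for t in self.triples:
                    u, v, w = t
                    for e in ((u, v), (u, w), (v, w)):
                        self.index.setdefault(e, []).append(t)

            def update(self, u, v, insert=True):
                e = (min(u, v), max(u, v))
                for t in self.index.get(e, []):
                    (self.present[t].add if insert else self.present[t].discard)(e)

            def estimate(self):
                # A uniform triple is a triangle with probability T / C(n, 3).
                complete = sum(1 for t in self.triples if len(self.present[t]) == 3)
                all_triples = self.n * (self.n - 1) * (self.n - 2) // 6
                return complete / len(self.triples) * all_triples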

    Efficient Dynamic Approximate Distance Oracles for Vertex-Labeled Planar Graphs

    Let $G$ be a graph where each vertex is associated with a label. A Vertex-Labeled Approximate Distance Oracle is a data structure that, given a vertex $v$ and a label $\lambda$, returns a $(1+\varepsilon)$-approximation of the distance from $v$ to the closest vertex with label $\lambda$ in $G$. Such an oracle is dynamic if it also supports label changes. In this paper we present three different dynamic approximate vertex-labeled distance oracles for planar graphs, all with polylogarithmic query and update times and nearly linear space requirements.
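    The oracle's contract can be stated as code. The sketch below is an exact brute-force reference (so trivially within $(1+\varepsilon)$), assuming an unweighted graph given as adjacency lists; the paper's constructions replace the linear-time BFS query with polylogarithmic query and update times using planar-graph machinery.

        from collections import deque

        class NaiveLabeledOracle:
            """Brute-force stand-in for a dynamic vertex-labeled distance oracle."""
            def __init__(self, adj, labels):
                self.adj = adj              # vertex -> list of neighbors
                self.labels = dict(labels)  # vertex -> label

            def change_label(self, v, lam):  # the 'dynamic' part: label updates
                self.labels[v] = lam

            def query(self, v, lam):
                # BFS until the first vertex carrying label lam.
                seen, q = {v}, deque([(v, 0)])
                while q:
                    u, d = q.popleft()
                    if self.labels.get(u) == lam:
                        return d
                    for w in self.adj.get(u, []):
                        if w not in seen:
                            seen.add(w)
                            q.append((w, d + 1))
                return float("inf")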

    On Counting Triangles through Edge Sampling in Large Dynamic Graphs

    Traditional frameworks for dynamic graphs have relied on processing only the stream of edges added into or deleted from an evolving graph, but not any additional related information such as the degrees or neighbor lists of nodes incident to the edges. In this paper, we propose a new edge sampling framework for big-graph analytics in dynamic graphs, which enhances the traditional model by enabling the use of additional related information. To demonstrate the advantages of this framework, we present a new sampling algorithm, called Edge Sample and Discard (ESD). It generates an unbiased estimate of the total number of triangles, which can be continuously updated in response to both edge additions and deletions. We provide a comparative analysis of the performance of ESD against two current state-of-the-art algorithms in terms of accuracy and complexity. The results of the experiments performed on real graphs show that, with the help of the neighborhood information of the sampled edges, the accuracy achieved by our algorithm is substantially better. We also characterize the impact of properties of the graph on the performance of our algorithm by testing on several Barabási-Albert graphs.
    Comment: A short version of this article appeared in Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2017).
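    A toy rendering of the Edge Sample and Discard idea, with illustrative names and none of the paper's variance analysis: on each update, with probability p consult the current common neighborhood of the edge's endpoints (the extra information the framework permits), credit the triangles this edge closes or breaks, and discard the sample. Here the adjacency is maintained locally for self-containedness; in the framework it is supplied alongside the stream.

        import random
        from collections import defaultdict

        def esd_estimate(stream, p=0.01, seed=0):
            """stream yields (u, v, sign): sign +1 inserts edge {u,v}, -1 deletes it.
            Each triangle is credited exactly once net (closed by its last-arriving
            edge, minus any breakage), so the expectation of the returned value
            equals the triangle count of the final graph."""
            rng = random.Random(seed)
            neighbors = defaultdict(set)
            acc = 0.0
            for u, v, sign in stream:
                if rng.random() < p:
                    acc += sign * len(neighbors[u] & neighbors[v])
                # Apply the update after (possibly) sampling it.
                if sign > 0:
                    neighbors[u].add(v); neighbors[v].add(u)
                else:
                    neighbors[u].discard(v); neighbors[v].discard(u)
            return acc / p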

    Wear Minimization for Cuckoo Hashing: How Not to Throw a Lot of Eggs into One Basket

    We study wear-leveling techniques for cuckoo hashing, showing that it is possible to achieve a memory wear bound of $\log\log n + O(1)$ after the insertion of $n$ items into a table of size $Cn$, for a suitable constant $C$, using cuckoo hashing. Moreover, we study our cuckoo hashing method empirically, showing that it significantly improves on the memory wear performance of classic cuckoo hashing and linear probing in practice.
    Comment: 13 pages, 1 table, 7 figures; to appear at the 13th Symposium on Experimental Algorithms (SEA 2014).
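    A hedged sketch of wear-aware cuckoo insertion, assuming a single table with two hash choices per key; the tie-breaking rule here (evict from the less-worn candidate cell) is a plausible illustration of wear leveling, not necessarily the paper's exact policy.

        import random

        class WearAwareCuckoo:
            """Toy cuckoo hash tracking per-cell write counts ('wear')."""
            def __init__(self, size, seed=0):
                rng = random.Random(seed)
                self.size = size
                self.salt = (rng.getrandbits(32), rng.getrandbits(32))
                self.table = [None] * size
                self.wear = [0] * size

            def _slots(self, key):
                return [hash((key, s)) % self.size for s in self.salt]

            def insert(self, key, max_kicks=64):
                for _ in range(max_kicks):
                    s0, s1 = self._slots(key)
                    for s in (s0, s1):
                        if self.table[s] is None:
                            self.table[s] = key
                            self.wear[s] += 1
                            return True
                    # Both occupied: evict from the less-worn cell.
                    s = s0 if self.wear[s0] <= self.wear[s1] else s1
                    self.table[s], key = key, self.table[s]
                    self.wear[s] += 1
                return False  # a full implementation would rehash here

            def lookup(self, key):
                return any(self.table[s] == key for s in self._slots(key))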

    Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution

    Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive safely. Given the limited hardware resources, existing 3D perception models are not able to recognize small instances (e.g., pedestrians, cyclists) very well due to the low-resolution voxelization and aggressive downsampling. To this end, we propose Sparse Point-Voxel Convolution (SPVConv), a lightweight 3D module that equips the vanilla Sparse Convolution with a high-resolution point-based branch. With negligible overhead, this point-based branch is able to preserve fine details even in large outdoor scenes. To explore the spectrum of efficient 3D models, we first define a flexible architecture design space based on SPVConv, and we then present 3D Neural Architecture Search (3D-NAS) to search for the optimal network architecture over this diverse design space efficiently and effectively. Experimental results validate that the resulting SPVNAS model is fast and accurate: it outperforms the state-of-the-art MinkowskiNet by 3.3%, ranking 1st on the competitive SemanticKITTI leaderboard. It also achieves 8x computation reduction and 3x measured speedup over MinkowskiNet with higher accuracy. Finally, we transfer our method to 3D object detection, and it achieves consistent improvements over the one-stage detection baseline on KITTI.
    Comment: ECCV 2020. The first two authors contributed equally to this work. Project page: http://spvnas.mit.edu
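    A schematic of the point-voxel fusion pattern in NumPy, with the sparse 3D convolution replaced by a per-voxel linear map; all names and shapes here are illustrative, and this is not the SPVConv implementation (see the project page for that).

        import numpy as np

        def spv_block(points, feats, voxel_size, W_voxel, W_point):
            """points: (N, 3) coordinates; feats: (N, C) features;
            W_voxel, W_point: (C, C') weights standing in for the two branches."""
            keys = np.floor(points / voxel_size).astype(np.int64)
            _, inv = np.unique(keys, axis=0, return_inverse=True)
            inv = inv.reshape(-1)
            # Coarse voxel branch: average features per voxel, then transform.
            num_vox = inv.max() + 1
            vox = np.zeros((num_vox, feats.shape[1]))
            np.add.at(vox, inv, feats)
            counts = np.bincount(inv, minlength=num_vox)[:, None]
            vox = (vox / counts) @ W_voxel
            # High-resolution point branch: per-point transform keeps fine detail.
            point_out = np.maximum(feats @ W_point, 0.0)  # ReLU
            # Fuse: gather each point's voxel feature, add its point feature.
            return vox[inv] + point_out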

    Dynamic Compressed Strings with Random Access

    We consider the problem of storing a string $S$ in dynamic compressed form, while permitting operations directly on the compressed representation of $S$: access a substring of $S$; replace, insert or delete a symbol in $S$; count how many occurrences of a given symbol appear in any given prefix of $S$ (called rank operation); and locate the position of the $i$th occurrence of a symbol inside $S$ (called select operation). We discuss the time complexity of several combinations of these operations along with the entropy space bounds of the corresponding compressed indexes. In this way, we extend or improve the bounds of previous work by Ferragina and Venturini [TCS, 2007], Jansson et al. [ICALP, 2012], and Nekrich and Navarro [SODA, 2013].
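    For reference, the supported operations read as follows on a naive, uncompressed dynamic string; the point of the paper is to support this same interface within entropy-compressed space and much better time bounds.

        class NaiveDynamicString:
            """Uncompressed reference for access/replace/insert/delete/rank/select."""
            def __init__(self, s):
                self.s = list(s)

            def access(self, i, j):       # substring S[i..j]
                return "".join(self.s[i:j + 1])

            def replace(self, i, c):
                self.s[i] = c

            def insert(self, i, c):
                self.s.insert(i, c)

            def delete(self, i):
                del self.s[i]

            def rank(self, c, i):         # occurrences of c in the prefix S[0..i]
                return self.s[:i + 1].count(c)

            def select(self, c, k):       # position of the k-th occurrence, 1-based
                seen = 0
                for pos, ch in enumerate(self.s):
                    if ch == c:
                        seen += 1
                        if seen == k:
                            return pos
                return -1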

    Efficiently Correcting Matrix Products

    We study the problem of efficiently correcting an erroneous product of two $n \times n$ matrices over a ring. Among other things, we provide a randomized algorithm for correcting a matrix product with at most $k$ erroneous entries running in $\tilde{O}(n^2 + kn)$ time and a deterministic $\tilde{O}(kn^2)$-time algorithm for this problem (where the notation $\tilde{O}$ suppresses polylogarithmic terms in $n$ and $k$).
    Comment: Fixed invalid reference to figure in v
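    A simple randomized corrector in the same spirit, assuming integer matrices: Freivalds-style tests locate the rows of C that disagree with AB (with high probability), and only those rows are recomputed. Recomputation costs O(n^2) per bad row, so this illustrates the setting rather than matching the paper's $\tilde{O}(n^2 + kn)$ bound.

        import numpy as np

        def correct_product(A, B, C, trials=2, rng=None):
            """Return a corrected copy of C, assuming C = A @ B up to k wrong entries."""
            n = A.shape[0]
            rng = rng or np.random.default_rng(0)
            bad = np.zeros(n, dtype=bool)
            for _ in range(trials):
                x = rng.integers(1, 2 * n + 1, size=n)  # random test vector
                resid = A @ (B @ x) - C @ x             # O(n^2) per trial
                bad |= resid != 0    # a row with an error is nonzero w.h.p.
            C = C.copy()
            for i in np.nonzero(bad)[0]:
                C[i] = A[i] @ B                         # recompute only bad rows
            return C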

    Sub-logarithmic Distributed Oblivious RAM with Small Block Size

    Oblivious RAM (ORAM) is a cryptographic primitive that allows a client to securely execute RAM programs over data that is stored in an untrusted server. Distributed Oblivious RAM is a variant of ORAM in which the data is stored in $m > 1$ servers. Extensive research over the last few decades has succeeded in reducing the bandwidth overhead of ORAM schemes, both in the single-server and the multi-server setting, from $O(\sqrt{N})$ to $O(1)$. However, all known protocols that achieve a sub-logarithmic overhead either require heavy server-side computation (e.g. homomorphic encryption), or a large block size of at least $\Omega(\log^3 N)$. In this paper, we present a family of distributed ORAM constructions that follow the hierarchical approach of Goldreich and Ostrovsky [GO96]. We enhance known techniques, and develop new ones, to take better advantage of the existence of multiple servers. By plugging efficient known hashing schemes into our constructions, we get the following results:
    1. For any $m \geq 2$, we show an $m$-server ORAM scheme with $O(\log N/\log\log N)$ overhead and block size $\Omega(\log^2 N)$. This scheme is private even against an $(m-1)$-server collusion.
    2. A 3-server ORAM construction with $O(\omega(1)\log N/\log\log N)$ overhead and an almost logarithmic block size, i.e., $\Omega(\log^{1+\epsilon} N)$.
    We also investigate a model where the servers are allowed to perform a linear amount of light local computations, and show that constant overhead is achievable in this model through a simple four-server ORAM protocol.
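    As background for the multi-server setting, the sketch below shows one classical building block of distributed schemes, a two-server XOR-based private read: each server sees only a uniformly random index set, yet the client recovers block i. This is emphatically not the paper's construction (real ORAMs also need oblivious writes and periodic reshuffling); names and the block format are illustrative.

        import secrets
        from functools import reduce

        def xor_blocks(db, idx, block_len):
            # A server's answer: the XOR of the requested blocks.
            return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                          (db[j] for j in idx), b"\x00" * block_len)

        def pir_read(db_server1, db_server2, i):
            """Both servers hold identical lists of equal-length byte blocks."""
            n, block_len = len(db_server1), len(db_server1[0])
            S = {j for j in range(n) if secrets.randbits(1)}  # uniform random subset
            T = S ^ {i}                                       # S with index i toggled
            a1 = xor_blocks(db_server1, S, block_len)
            a2 = xor_blocks(db_server2, T, block_len)
            # Everything except block i cancels in the XOR of the two answers.
            return bytes(x ^ y for x, y in zip(a1, a2))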

    Interactive Learning for Multimedia at Large

    Interactive learning has been suggested as a key method for addressing analytic multimedia tasks arising in several domains. Until recently, however, methods to maintain interactive performance at the scale of today's media collections have not been addressed. We propose an interactive learning approach that builds on and extends the state of the art in user relevance feedback systems and high-dimensional indexing for multimedia. We report on a detailed experimental study using the ImageNet and YFCC100M collections, containing 14 million and 100 million images respectively. The proposed approach outperforms the relevant state-of-the-art approaches in terms of interactive performance, while improving suggestion relevance in some cases. In particular, even on YFCC100M, our approach requires less than 0.3 s per interaction round to generate suggestions, using a single computing core and less than 7 GB of main memory.
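    To fix ideas, one interaction round can be sketched as classical Rocchio-style relevance feedback over a feature matrix; this stand-in scores all items by brute force, whereas the system described above delegates that step to an approximate high-dimensional index so it never touches the full collection. All names and parameters here are illustrative.

        import numpy as np

        def feedback_round(features, pos_ids, neg_ids, k=10, alpha=1.0, beta=0.5):
            """features: (N, d) item vectors; returns the k best unjudged items."""
            # Move the query toward judged-relevant items, away from negatives.
            q = alpha * features[pos_ids].mean(axis=0)
            if len(neg_ids):
                q -= beta * features[neg_ids].mean(axis=0)
            scores = features @ q                 # brute-force scoring step
            judged = set(pos_ids) | set(neg_ids)
            ranked = [i for i in np.argsort(-scores) if i not in judged]
            return ranked[:k]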