
A directed isoperimetric inequality with application to Bregman near neighbor lower bounds

Author: Chaudhuri Kamalika
Marcel
Talagrand Michel
Publication date: 16/05/2015

Bregman divergences $D_\phi$ are a class of divergences parametrized by a convex function $\phi$ and include well-known distance functions like $\ell_2^2$ and the Kullback-Leibler divergence. There has been extensive research on algorithms for problems like clustering and near neighbor search with respect to Bregman divergences; in all cases, the algorithms depend not just on the data size $n$ and dimensionality $d$, but also on a structure constant $\mu \ge 1$ that depends solely on $\phi$ and can grow without bound independently. In this paper, we provide the first evidence that this dependence on $\mu$ might be intrinsic. We focus on the problem of approximate near neighbor search for Bregman divergences. We show that under the cell probe model, any non-adaptive data structure (like locality-sensitive hashing) for $c$-approximate near-neighbor search that admits $r$ probes must use space $\Omega(n^{1 + \frac{\mu}{c r}})$. In contrast, for LSH under $\ell_1$ the best bound is $\Omega(n^{1+\frac{1}{cr}})$. Our new tool is a directed variant of the standard boolean noise operator. We show that a generalization of the Bonami-Beckner hypercontractivity inequality exists "in expectation" or upon restriction to certain subsets of the Hamming cube, and that this is sufficient to prove the desired isoperimetric inequality that we use in our data structure lower bound. We also present a structural result reducing the Hamming cube to a Bregman cube. This structure allows us to obtain lower bounds for problems under Bregman divergences from their $\ell_1$ analog. In particular, we get a (weaker) lower bound for approximate near neighbor search of the form $\Omega(n^{1 + \frac{1}{cr}})$ for an $r$-query non-adaptive data structure, and new cell probe lower bounds for a number of other near neighbor questions in Bregman space.

Comment: 27 pages

arXiv.org e-Print Archive

Crossref
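
As an illustration of the Bregman divergences mentioned in the abstract above, the following minimal sketch evaluates the standard definition $D_\phi(p, q) = \phi(p) - \phi(q) - \langle \nabla\phi(q), p - q \rangle$ for two choices of $\phi$. The function names (`bregman`, `sq_norm`, `neg_entropy`) and the sample vectors are assumptions made for this example and do not come from the paper.

```python
import math


def bregman(phi, grad_phi, p, q):
    """Return D_phi(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_phi(q), p, q))
    return phi(p) - phi(q) - inner


# phi(x) = ||x||_2^2 yields the squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)


def sq_norm_grad(x):
    return [2 * v for v in x]


# phi(x) = sum_i x_i log x_i yields the Kullback-Leibler divergence
# when p and q are probability vectors with positive entries.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)


def neg_entropy_grad(x):
    return [math.log(v) + 1 for v in x]


# Hypothetical sample points, chosen only for illustration.
p = [0.2, 0.8]
q = [0.5, 0.5]

d_l2 = bregman(sq_norm, sq_norm_grad, p, q)
d_kl_pq = bregman(neg_entropy, neg_entropy_grad, p, q)
d_kl_qp = bregman(neg_entropy, neg_entropy_grad, q, p)
```

Here `d_l2` agrees with the squared distance between `p` and `q`, while `d_kl_pq` and `d_kl_qp` differ, which shows that a Bregman divergence need not be symmetric.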

Dynamic Integer Sets with Optimal Rank, Select, and Predecessor Search

Author: Patrascu Mihai
Thorup Mikkel
Publication date: 01/01/2014

We present a data structure representing a dynamic set S of w-bit integers on a w-bit word RAM. With |S|=n and w > log n and space O(n), we support the following standard operations in O(log n / log w) time:
- insert(x) sets S = S + {x}.
- delete(x) sets S = S - {x}.
- predecessor(x) returns max{y in S | y < x}.
- successor(x) returns min{y in S | y >= x}.
- rank(x) returns #{y in S | y < x}.
- select(i) returns y in S with rank(y)=i, if any.

Our O(log n/log w) bound is optimal for dynamic rank and select, matching a lower bound of Fredman and Saks [STOC'89]. When the word length is large, our time bound is also optimal for dynamic predecessor, matching a static lower bound of Beame and Fich [STOC'99] whenever log n/log w=O(log w/loglog w). Technically, the most interesting aspect of our data structure is that it supports all the above operations in constant time for sets of size n=w^{O(1)}. This resolves a main open problem of Ajtai, Komlos, and Fredman [FOCS'83]. Ajtai et al. presented such a data structure in Yao's abstract cell-probe model with w-bit cells/words, but pointed out that the functions used could not be implemented. As a partial solution to the problem, Fredman and Willard [STOC'90] introduced a fusion node that could handle queries in constant time, but used polynomial time on the updates. We call our small set data structure a dynamic fusion node as it does both queries and updates in constant time.

Comment: Presented with different formatting in Proceedings of the 55th IEEE Symposium on Foundations of Computer Science (FOCS), 2014, pp. 166--175. The new version fixes a bug in one of the bounds stated for predecessor search, pointed out to me by Djamal Belazzougu

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System
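
To make the semantics of the set operations in the abstract above concrete, here is a small reference sketch built on a sorted Python list. It is not the dynamic fusion node from the paper and does not achieve the O(log n / log w) bound: insertion and deletion here take linear time. The class name `SortedIntSet` is hypothetical, and the strict inequality used in `predecessor` is an assumption chosen to match the definition of rank.

```python
import bisect


class SortedIntSet:
    """Reference semantics only: a sorted list of distinct integers."""

    def __init__(self):
        self._keys = []

    def insert(self, x):
        # S = S + {x}; inserting an existing key leaves S unchanged.
        i = bisect.bisect_left(self._keys, x)
        if i == len(self._keys) or self._keys[i] != x:
            self._keys.insert(i, x)

    def delete(self, x):
        # S = S - {x}; deleting a missing key leaves S unchanged.
        i = bisect.bisect_left(self._keys, x)
        if i < len(self._keys) and self._keys[i] == x:
            self._keys.pop(i)

    def rank(self, x):
        # Number of keys y in S with y < x.
        return bisect.bisect_left(self._keys, x)

    def predecessor(self, x):
        # Largest key y in S with y < x (assumed strict), or None.
        i = self.rank(x)
        return self._keys[i - 1] if i > 0 else None

    def select(self, i):
        # The key y in S with rank(y) == i, or None if there is none.
        return self._keys[i] if 0 <= i < len(self._keys) else None


s = SortedIntSet()
for key in (5, 1, 9, 5):
    s.insert(key)
```

With this definition, `select(rank(y)) == y` holds for every key `y` in the set, which is the relationship between rank and select stated in the abstract.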

Cell-probe Lower Bounds for Dynamic Problems via a New Communication Model

Author: Clifford Raphaël
Pătrașcu Mihai
Publication date: 03/12/2015

In this paper, we develop a new communication model to prove a data structure lower bound for the dynamic interval union problem. The problem is to maintain a multiset of intervals $\mathcal{I}$ over $[0, n]$ with integer coordinates, supporting the following operations:
- insert(a, b): add an interval $[a, b]$ to $\mathcal{I}$, provided that $a$ and $b$ are integers in $[0, n]$;
- delete(a, b): delete a (previously inserted) interval $[a, b]$ from $\mathcal{I}$;
- query(): return the total length of the union of all intervals in $\mathcal{I}$.

It is related to the two-dimensional case of Klee's measure problem. We prove that there is a distribution over sequences of operations with $O(n)$ insertions and deletions, and $O(n^{0.01})$ queries, for which any data structure with any constant error probability requires $\Omega(n\log n)$ time in expectation. Interestingly, we use the sparse set disjointness protocol of Håstad and Wigderson [ToC'07] to speed up a reduction from a new kind of nondeterministic communication games, for which we prove lower bounds. For applications, we prove lower bounds for several dynamic graph problems by reducing them from dynamic interval union.

arXiv.org e-Print Archive

Crossref
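
The dynamic interval union interface described in the abstract above can be sketched with a naive baseline that stores the multiset in a counter and recomputes the union length with a sort-and-sweep on every query. This is only an illustration of the problem statement, not a construction from the paper, and the class name `NaiveIntervalUnion` is hypothetical. It assumes `a <= b` for every interval.

```python
from collections import Counter


class NaiveIntervalUnion:
    """Multiset of integer intervals [a, b]; query() sweeps all of them."""

    def __init__(self):
        self._intervals = Counter()  # (a, b) -> multiplicity

    def insert(self, a, b):
        self._intervals[(a, b)] += 1

    def delete(self, a, b):
        # Remove one copy of a previously inserted interval, if present.
        if self._intervals[(a, b)] > 0:
            self._intervals[(a, b)] -= 1
            if self._intervals[(a, b)] == 0:
                del self._intervals[(a, b)]

    def query(self):
        # Total length of the union, found by merging sorted intervals.
        total = 0
        cur_lo = cur_hi = None
        for a, b in sorted(self._intervals):
            if cur_hi is None or a > cur_hi:
                if cur_hi is not None:
                    total += cur_hi - cur_lo
                cur_lo, cur_hi = a, b
            else:
                cur_hi = max(cur_hi, b)
        if cur_hi is not None:
            total += cur_hi - cur_lo
        return total


u = NaiveIntervalUnion()
u.insert(0, 4)
u.insert(2, 6)
u.insert(8, 10)
u.insert(2, 6)  # second copy of the same interval
```

Each query costs time proportional to sorting the distinct stored intervals, so the baseline is useful only for checking the behaviour of the three operations on small inputs.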