
Fast Moment Estimation in Data Streams in Optimal Space

Authors: Daniel M. Kane, Jelani Nelson, Ely Porat, David P. Woodruff
Publication venue: Association for Computing Machinery (ACM)
Publication date: 22/01/2015

We give a space-optimal streaming algorithm with update time $O(\log^2(1/\epsilon)\log\log(1/\epsilon))$ for approximating the $p$th frequency moment, $0 < p < 2$, of a length-$n$ vector updated in a data stream up to a factor of $1 \pm \epsilon$. This provides a nearly exponential improvement over the previous space-optimal algorithm of [Kane-Nelson-Woodruff, SODA 2010], which had update time $\Omega(1/\epsilon^2)$. When combined with the work of [Harvey-Nelson-Onak, FOCS 2008], we also obtain the first algorithm for entropy estimation in turnstile streams which simultaneously achieves near-optimal space and fast update time.

Engineering and Applied Science
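
To make the quantity in this abstract concrete, the following is a minimal sketch of an exact, non-streaming baseline: it stores the whole vector and computes the $p$th frequency moment $F_p = \sum_i |x_i|^p$ after a sequence of turnstile updates. It is not the space-optimal algorithm of the paper; the function names and the `(index, delta)` update format are assumptions made for illustration.

```python
from collections import defaultdict


def exact_frequency_moment(updates, p):
    """Exact F_p = sum_i |x_i|^p of the vector built from turnstile updates.

    `updates` is an iterable of (index, delta) pairs (hypothetical format);
    each pair means x[index] += delta, and delta may be negative.
    This baseline uses space linear in the number of touched coordinates.
    """
    x = defaultdict(int)
    for index, delta in updates:
        x[index] += delta
    # Coordinates that cancelled back to zero contribute nothing.
    return sum(abs(v) ** p for v in x.values() if v != 0)


def within_factor(estimate, truth, eps):
    """Check the (1 +/- eps) multiplicative guarantee against the exact value."""
    return (1 - eps) * truth <= estimate <= (1 + eps) * truth
```

A streaming estimator could be validated on small inputs by comparing its output to `exact_frequency_moment` with `within_factor`.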

Harvard University - DASH

Recursive Sketching For Frequency Moments

Authors: Vladimir Braverman, Rafail Ostrovsky
Publication date: 11/11/2010

In a ground-breaking paper, Indyk and Woodruff (STOC 05) showed how to compute $F_k$ (for $k>2$) in space complexity $O(\text{poly-log}(n,m)\cdot n^{1-\frac{2}{k}})$, which is optimal up to (large) poly-logarithmic factors in $n$ and $m$, where $m$ is the length of the stream and $n$ is the upper bound on the number of distinct elements in a stream. The best known lower bound for large moments is $\Omega(\log(n)\, n^{1-\frac{2}{k}})$. A follow-up work of Bhuvanagiri, Ganguly, Kesh and Saha (SODA 2006) reduced the poly-logarithmic factors of Indyk and Woodruff to $O(\log^2(m)\cdot (\log n+ \log m)\cdot n^{1-\frac{2}{k}})$. Further reduction of poly-log factors has been an elusive goal since 2006, when the Indyk and Woodruff method seemed to hit a natural "barrier." Using our simple recursive sketch, we provide a different yet simple approach to obtain an $O(\log(m)\log(nm)\cdot (\log\log n)^4\cdot n^{1-\frac{2}{k}})$ algorithm for constant $\epsilon$ (our bound is, in fact, somewhat stronger, where the $(\log\log n)$ term can be replaced by any constant number of $\log$ iterations instead of just two or three, thus approaching $\log^* n$). Our bound also works for non-constant $\epsilon$ (for details see the body of the paper). Further, our algorithm requires only 4-wise independence, in contrast to existing methods that use pseudo-random generators for computing large frequency moments.
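
As background for the 4-wise independence requirement mentioned in this abstract, here is a minimal sketch of the standard textbook construction of a 4-wise independent hash family: a uniformly random polynomial of degree at most 3 over a prime field. It illustrates the independence notion only and is not the recursive sketch of the paper; `PRIME` and `make_four_wise_hash` are hypothetical names.

```python
import random

# Assumption: all keys are integers in [0, PRIME). 2^61 - 1 is a Mersenne prime.
PRIME = (1 << 61) - 1


def make_four_wise_hash(rng):
    """Draw h(x) = a3*x^3 + a2*x^2 + a1*x + a0 mod PRIME with random coefficients.

    For any 4 distinct keys in [0, PRIME), the 4 hash values are independent
    and uniform over [0, PRIME), because a polynomial of degree at most 3 is
    determined uniquely by its values at 4 distinct points.
    """
    coeffs = [rng.randrange(PRIME) for _ in range(4)]

    def h(x):
        acc = 0
        for c in coeffs:  # Horner's rule, highest-degree coefficient first
            acc = (acc * x + c) % PRIME
        return acc

    return h
```

Storing such a function takes only four field elements, which is why limited independence is attractive compared with a general pseudo-random generator.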

arXiv.org e-Print Archive

CiteSeerX

On the Power of Adaptivity in Sparse Recovery

Authors: Piotr Indyk, Eric Price, David P. Woodruff
Publication date: 01/01/2011

The goal of (stable) sparse recovery is to recover a $k$-sparse approximation $x^*$ of a vector $x$ from linear measurements of $x$. Specifically, the goal is to recover $x^*$ such that $\|x-x^*\|_p \le C \min_{k\text{-sparse } x'} \|x-x'\|_q$ for some constant $C$ and norm parameters $p$ and $q$. It is known that, for $p=q=1$ or $p=q=2$, this task can be accomplished using $m=O(k \log (n/k))$ non-adaptive measurements [CRT06] and that this bound is tight [DIPW10,FPRU10,PW11]. In this paper we show that if one is allowed to perform measurements that are adaptive, then the number of measurements can be considerably reduced. Specifically, for $C=1+\epsilon$ and $p=q=2$ we show:

- A scheme with $m=O((1/\epsilon)\, k \log\log (n\epsilon/k))$ measurements that uses $O(\log^* k \cdot \log\log (n\epsilon/k))$ rounds. This is a significant improvement over the best possible non-adaptive bound.
- A scheme with $m=O((1/\epsilon)\, k \log (k/\epsilon) + k \log (n/k))$ measurements that uses two rounds. This improves over the best possible non-adaptive bound.

To the best of our knowledge, these are the first results of this type. As an independent application, we show how to solve the problem of finding a duplicate in a data stream of $n$ items drawn from $\{1, 2, \ldots, n-1\}$ using $O(\log n)$ bits of space and $O(\log\log n)$ passes, improving over the best possible space complexity achievable using a single pass.

Comment: 18 pages; appearing at FOCS 201
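
The duplicate-finding problem stated in this abstract can be illustrated with the classic multi-pass baseline: binary search over the value range using a single counter. This sketch keeps only a constant number of $O(\log n)$-bit counters but needs $O(\log n)$ passes, so it is not the $O(\log\log n)$-pass scheme of the paper. The name `find_duplicate` and the `make_stream` callback, which returns a fresh iterator for each pass, are assumptions made for illustration.

```python
def find_duplicate(make_stream, n):
    """Return a value that occurs at least twice among n items from {1, ..., n-1}.

    Invariant: the range [lo, hi] always receives more than (hi - lo + 1)
    items, so by the pigeonhole principle it contains a duplicate.
    Each loop iteration makes one pass over the stream.
    """
    lo, hi = 1, n - 1
    while lo < hi:
        mid = (lo + hi) // 2
        # One pass: count the items whose value falls in the lower half.
        count = sum(1 for item in make_stream() if lo <= item <= mid)
        if count > mid - lo + 1:
            hi = mid       # lower half is overfull, so it holds a duplicate
        else:
            lo = mid + 1   # otherwise the upper half must be overfull
    return lo
```

For example, `find_duplicate(lambda: iter([3, 1, 3, 2]), 4)` makes one pass, finds that the range {1, 2} is not overfull, and returns 3.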

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Crossref

Sparser Johnson-Lindenstrauss Transforms

Authors: Daniel M. Kane, Jelani Nelson
Publication date: 01/01/2014

We give two different and simple constructions for dimensionality reduction in $\ell_2$ via linear mappings that are sparse: only an $O(\varepsilon)$-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion $1+\varepsilon$ with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas (JCSS 2003) and Dasgupta, Kumar, and Sarlós (STOC 2010). Such distributions can be used to speed up applications where $\ell_2$ dimensionality reduction is used.

Comment: v6: journal version, minor changes, added Remark 23; v5: modified abstract, fixed typos, added open problem section; v4: simplified section 4 by giving 1 analysis that covers both constructions; v3: proof of Theorem 25 in v2 was written incorrectly, now fixed; v2: Added another construction achieving same upper bound, and added proof of near-tight lower bound for DKS schem
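
The idea of a sparse embedding matrix can be sketched as follows: each column gets exactly `s` non-zero entries of value $\pm 1/\sqrt{s}$ placed in distinct random rows, so that every unit basis vector is mapped to a vector of norm exactly 1. This is a generic illustration of column-sparse random sign matrices and is not claimed to match either construction in the paper. The function names are hypothetical, and the caller must choose the row count `m` and the sparsity `s` (for the setting of the abstract, `s` would be roughly an $O(\varepsilon)$-fraction of `m`).

```python
import math
import random


def make_sparse_embedding(d, m, s, rng):
    """Build a sparse m x d matrix, stored column by column.

    Each column is a list of s (row, value) pairs with distinct rows and
    values +1/sqrt(s) or -1/sqrt(s) chosen by independent fair coin flips.
    """
    scale = 1.0 / math.sqrt(s)
    columns = []
    for _ in range(d):
        rows = rng.sample(range(m), s)  # s distinct rows for this column
        columns.append(
            [(r, scale if rng.random() < 0.5 else -scale) for r in rows]
        )
    return columns


def apply_embedding(columns, m, x):
    """Compute y = S x; the cost is s operations per non-zero entry of x."""
    y = [0.0] * m
    for xi, col in zip(x, columns):
        if xi != 0:
            for r, v in col:
                y[r] += v * xi
    return y
```

Because only `s` entries per column are touched, embedding a vector costs time proportional to `s` times its number of non-zero coordinates instead of `m` times that number, which is the speed-up the abstract refers to.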

arXiv.org e-Print Archive

Crossref

Harvard University - DASH

eScholarship - University of California