    Finding needles in noisy haystacks

    On the Power of Adaptivity in Sparse Recovery

    The goal of (stable) sparse recovery is to recover a $k$-sparse approximation $x^*$ of a vector $x$ from linear measurements of $x$. Specifically, the goal is to recover $x^*$ such that $\|x - x^*\|_p \le C \min_{k\text{-sparse } x'} \|x - x'\|_q$ for some constant $C$ and norm parameters $p$ and $q$. It is known that, for $p = q = 1$ or $p = q = 2$, this task can be accomplished using $m = O(k \log(n/k))$ non-adaptive measurements [CRT06] and that this bound is tight [DIPW10, FPRU10, PW11]. In this paper we show that if one is allowed to perform measurements that are adaptive, then the number of measurements can be considerably reduced. Specifically, for $C = 1 + \epsilon$ and $p = q = 2$ we show:
    - A scheme with $m = O((1/\epsilon)\, k \log\log(n\epsilon/k))$ measurements that uses $O(\log^* k \cdot \log\log(n\epsilon/k))$ rounds. This is a significant improvement over the best possible non-adaptive bound.
    - A scheme with $m = O((1/\epsilon)\, k \log(k/\epsilon) + k \log(n/k))$ measurements that uses two rounds. This improves over the best possible non-adaptive bound.
    To the best of our knowledge, these are the first results of this type. As an independent application, we show how to solve the problem of finding a duplicate in a data stream of $n$ items drawn from $\{1, 2, \ldots, n-1\}$ using $O(\log n)$ bits of space and $O(\log\log n)$ passes, improving over the best possible space complexity achievable using a single pass.
    Comment: 18 pages; appearing at FOCS 2011
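
    The recovery guarantee above is easy to make concrete. Below is a minimal Python sketch, not from the paper, of the $(1+\epsilon)$-approximate $\ell_2/\ell_2$ guarantee for $p = q = 2$; the function names are illustrative. The best $k$-sparse approximation of $x$ keeps its $k$ largest-magnitude entries, so the baseline error is the $\ell_2$ norm of the remaining tail.

        import math

        def best_k_sparse_error(x, k):
            # l2 error of the best k-sparse approximation of x: zero out
            # everything except the k largest-magnitude entries, so the
            # error is the l2 norm of the remaining "tail" entries.
            tail = sorted((abs(v) for v in x), reverse=True)[k:]
            return math.sqrt(sum(v * v for v in tail))

        def satisfies_guarantee(x, x_star, k, eps):
            # Check ||x - x*||_2 <= (1 + eps) * min_{k-sparse x'} ||x - x'||_2.
            err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_star)))
            return err <= (1 + eps) * best_k_sparse_error(x, k)

        # Example: the best 2-sparse approximation of x keeps 5 and -4,
        # so recovering exactly those entries meets the guarantee.
        x = [5.0, 0.1, -4.0, 0.2, 0.05]
        assert satisfies_guarantee(x, [5.0, 0.0, -4.0, 0.0, 0.0], k=2, eps=0.1)

    The duplicate-finding application can also be illustrated, though not with the paper's $O(\log\log n)$-pass scheme: the sketch below is the classic pigeonhole binary search over the value range, which uses $O(\log n)$ passes and $O(\log n)$ bits of state. Here stream_factory is an assumed helper that returns a fresh iterator over the stream for each pass.

        def find_duplicate(stream_factory):
            # n items drawn from {1, ..., n-1} must contain a duplicate
            # (pigeonhole). Each pass counts how many items fall in the
            # lower half of the candidate range; an over-full half must
            # contain a duplicate. O(log n) passes, O(log n) bits of state.
            n = sum(1 for _ in stream_factory())  # one pass to learn n
            lo, hi = 1, n - 1
            while lo < hi:
                mid = (lo + hi) // 2
                count = sum(1 for v in stream_factory() if lo <= v <= mid)
                if count > mid - lo + 1:  # [lo, mid] is over-full
                    hi = mid
                else:                     # so [mid+1, hi] must be over-full
                    lo = mid + 1
            return lo

        assert find_duplicate(lambda: iter([1, 3, 2, 3])) == 3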

    Mining the ‘Internet Graveyard’: Rethinking the Historians’ Toolkit

    “Mining the Internet Graveyard” argues that the advent of massive quantities of born-digital historical sources necessitates a rethinking of the historians’ toolkit. The contours of a third wave of computational history are outlined, a trend marked by ever-increasing amounts of digitized information (especially web-based), falling digital storage costs, a move to the cloud, and a corresponding increase in the computational power available to process these sources. Following this, the article uses a case study of an early born-digital archive at Library and Archives Canada – Canada’s Digital Collections project (CDC) – to bring some of these problems into view. An array of off-the-shelf data analysis solutions, coupled with code written in Mathematica, helps us bring context to and retrieve information from a digital collection on a previously inaccessible scale. The article concludes with an illustration of the various computational tools available, as well as a call for greater digital literacy in history curricula and professional development.
    Social Sciences and Humanities Research Council || 430-2013-061