    Finding needles in noisy haystacks

    On the Power of Adaptivity in Sparse Recovery

    The goal of (stable) sparse recovery is to recover a $k$-sparse approximation $x^*$ of a vector $x$ from linear measurements of $x$. Specifically, the goal is to recover $x^*$ such that $\|x - x^*\|_p \le C \min_{k\text{-sparse } x'} \|x - x'\|_q$ for some constant $C$ and norm parameters $p$ and $q$. It is known that, for $p = q = 1$ or $p = q = 2$, this task can be accomplished using $m = O(k \log(n/k))$ non-adaptive measurements [CRT06] and that this bound is tight [DIPW10, FPRU10, PW11]. In this paper we show that if one is allowed to perform measurements that are adaptive, then the number of measurements can be considerably reduced. Specifically, for $C = 1 + \epsilon$ and $p = q = 2$ we show:
    - A scheme with $m = O((1/\epsilon)\, k \log\log(n\epsilon/k))$ measurements that uses $O(\log^* k \cdot \log\log(n\epsilon/k))$ rounds. This is a significant improvement over the best possible non-adaptive bound.
    - A scheme with $m = O((1/\epsilon)\, k \log(k/\epsilon) + k \log(n/k))$ measurements that uses two rounds. This improves over the best possible non-adaptive bound.
    To the best of our knowledge, these are the first results of this type. As an independent application, we show how to solve the problem of finding a duplicate in a data stream of $n$ items drawn from $\{1, 2, \ldots, n-1\}$ using $O(\log n)$ bits of space and $O(\log\log n)$ passes, improving over the best possible space complexity achievable using a single pass.
    Comment: 18 pages; appearing at FOCS 2011
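
    The recovery guarantee above is easy to make concrete. Below is a minimal Python sketch, not from the paper, of the $(1+\epsilon)$-approximate $\ell_2/\ell_2$ guarantee for $p = q = 2$; the function names are illustrative. The best $k$-sparse approximation of $x$ keeps its $k$ largest-magnitude entries, so the baseline error is the $\ell_2$ norm of the remaining tail.

        import math

        def best_k_sparse_error(x, k):
            # l2 error of the best k-sparse approximation of x: zero out
            # everything except the k largest-magnitude entries, so the
            # error is the l2 norm of the remaining "tail" entries.
            tail = sorted((abs(v) for v in x), reverse=True)[k:]
            return math.sqrt(sum(v * v for v in tail))

        def satisfies_guarantee(x, x_star, k, eps):
            # Check ||x - x*||_2 <= (1 + eps) * min_{k-sparse x'} ||x - x'||_2.
            err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_star)))
            return err <= (1 + eps) * best_k_sparse_error(x, k)

        # Example: the best 2-sparse approximation of x keeps 5 and -4,
        # so recovering exactly those entries meets the guarantee.
        x = [5.0, 0.1, -4.0, 0.2, 0.05]
        assert satisfies_guarantee(x, [5.0, 0.0, -4.0, 0.0, 0.0], k=2, eps=0.1)

    The duplicate-finding application can also be illustrated, though not with the paper's $O(\log\log n)$-pass scheme: the sketch below is the classic pigeonhole binary search over the value range, which uses $O(\log n)$ passes and $O(\log n)$ bits of state. Here stream_factory is an assumed helper that returns a fresh iterator over the stream for each pass.

        def find_duplicate(stream_factory):
            # n items drawn from {1, ..., n-1} must contain a duplicate
            # (pigeonhole). Each pass counts how many items fall in the
            # lower half of the candidate range; an over-full half must
            # contain a duplicate. O(log n) passes, O(log n) bits of state.
            n = sum(1 for _ in stream_factory())  # one pass to learn n
            lo, hi = 1, n - 1
            while lo < hi:
                mid = (lo + hi) // 2
                count = sum(1 for v in stream_factory() if lo <= v <= mid)
                if count > mid - lo + 1:  # [lo, mid] is over-full
                    hi = mid
                else:                     # so [mid+1, hi] must be over-full
                    lo = mid + 1
            return lo

        assert find_duplicate(lambda: iter([1, 3, 2, 3])) == 3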

    Mining the ‘Internet Graveyard’: Rethinking the Historians’ Toolkit

    “Mining the Internet Graveyard” argues that the advent of massive quantities of born-digital historical sources necessitates a rethinking of the historians’ toolkit. The contours of a third wave of computational history are outlined, a trend marked by ever-increasing amounts of digitized information (especially web-based), falling digital storage costs, a move to the cloud, and a corresponding increase in the computational power available to process these sources. Following this, the article uses a case study of an early born-digital archive at Library and Archives Canada – Canada’s Digital Collections project (CDC) – to bring some of these problems into view. An array of off-the-shelf data analysis solutions, coupled with code written in Mathematica, helps us bring context to and retrieve information from a digital collection on a previously inaccessible scale. The article concludes with an illustration of the various computational tools available, as well as a call for greater digital literacy in history curricula and professional development.
    Social Sciences and Humanities Research Council || 430-2013-061