11 research outputs found

On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation

Author: Nelson Jelani
Nguyễn Huy L.
Woodruff David P.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

We study classic streaming and sparse recovery problems using deterministic linear sketches, including $\ell_1/\ell_1$ and $\ell_\infty/\ell_1$ sparse recovery problems (the latter also being known as $\ell_1$-heavy hitters), norm estimation, and approximate inner product. We focus on devising a fixed matrix $A \in \mathbb{R}^{m \times n}$ and a deterministic recovery/estimation procedure which work for all possible input vectors simultaneously. Our results improve upon existing work, the following being our main contributions:

• A proof that $\ell_\infty/\ell_1$ sparse recovery and inner product estimation are equivalent, and that incoherent matrices can be used to solve both problems. Our upper bound for the number of measurements is $m = O(\varepsilon^{-2} \min\{\log n, (\log n/\log(1/\varepsilon))^2\})$. We can also obtain fast sketching and recovery algorithms by making use of the Fast Johnson–Lindenstrauss transform. Both our running times and number of measurements improve upon previous work. We can also obtain better error guarantees than previous work in terms of a smaller tail of the input vector.
• A new lower bound for the number of linear measurements required to solve $\ell_1/\ell_1$ sparse recovery. We show $\Omega(k/\varepsilon^2 + k \log(n/k)/\varepsilon)$ measurements are required to recover an $x'$ with $\|x - x'\|_1 \leq (1+\varepsilon)\|x_{tail(k)}\|_1$, where $x_{tail(k)}$ is $x$ projected onto all but its largest $k$ coordinates in magnitude.
• A tight bound of $m = \Theta(\varepsilon^{-2} \log(\varepsilon^2 n))$ on the number of measurements required to solve deterministic norm estimation, i.e., to recover $\|x\|_2 \pm \varepsilon\|x\|_1$.

For all the problems we study, tight bounds are already known for the randomized complexity from previous work, except in the case of $\ell_1/\ell_1$ sparse recovery, where a nearly tight bound is known. Our work thus aims to study the deterministic complexities of these problems. We remark that some of the matrices used in our algorithms, although known to exist, currently are not yet explicit in the sense that deterministic polynomial time constructions are not yet known, although in all cases polynomial time Monte Carlo algorithms are known.

Engineering and Applied Science
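
To make the ℓ1/ℓ1 guarantee in the abstract above concrete, here is a minimal Python sketch that computes the tail norm and checks whether a candidate x′ meets the (1+ε) bound. It does not implement the paper's sketching matrix or recovery procedure; the function names `tail_l1`, `best_k_sparse`, and `satisfies_l1_l1` are hypothetical and chosen only for illustration.

```python
def tail_l1(x, k):
    """l1 norm of x after dropping its k largest-magnitude coordinates."""
    mags = sorted(abs(v) for v in x)          # ascending magnitudes
    return sum(mags[: max(len(mags) - k, 0)])  # everything except the top k


def best_k_sparse(x, k):
    """Keep the k largest-magnitude coordinates of x and zero out the rest."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def satisfies_l1_l1(x, x_prime, k, eps):
    """Check ||x - x'||_1 <= (1 + eps) * ||x_tail(k)||_1."""
    err = sum(abs(a - b) for a, b in zip(x, x_prime))
    return err <= (1 + eps) * tail_l1(x, k)


if __name__ == "__main__":
    x = [10.0, -7.0, 0.5, 0.2, -0.1]
    print(tail_l1(x, 2))
    print(satisfies_l1_l1(x, best_k_sparse(x, 2), k=2, eps=0.1))
    print(satisfies_l1_l1(x, [0.0] * len(x), k=2, eps=0.1))
```

The best k-sparse approximation always meets the bound, since its error equals the tail norm itself; the difficulty studied in the paper is reaching a comparable error from only m linear measurements Ax rather than from x directly.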

CiteSeerX

Harvard University - DASH

Fast Moment Estimation in Data Streams in Optimal Space

Author: Kane Daniel M.
Nelson Jelani
Porat Ely
Woodruff David P.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/01/2015

We give a space-optimal streaming algorithm with update time $O(\log^2(1/\epsilon) \log\log(1/\epsilon))$ for approximating the pth frequency moment, 0 < p < 2, of a length-n vector updated in a data stream up to a factor of $1 \pm \epsilon$. This provides a nearly exponential improvement over the previous space optimal algorithm of [Kane-Nelson-Woodruff, SODA 2010], which had update time $\Omega(1/\epsilon^2)$. When combined with the work of [Harvey-Nelson-Onak, FOCS 2008], we also obtain the first algorithm for entropy estimation in turnstile streams which simultaneously achieves near-optimal space and fast update time.

Engineering and Applied Science

Harvard University - DASH

The Power Of Locality In Network Algorithms

Author: Brautbar Michael
Publication venue: ScholarlyCommons
Publication date: 01/01/2013

Over the last decade we have witnessed the rapid proliferation of large-scale complex networks, spanning many social, information and technological domains. While many of the tasks which users of such networks face are essentially global and involve the network as a whole, the size of these networks is huge and the information available to users is only local. In this dissertation we show that even when faced with stringent locality constraints, one can still effectively solve prominent algorithmic problems on such networks. In the first part of the dissertation we present a natural algorithmic framework designed to model the behaviour of an external agent trying to solve a network optimization problem with limited access to the network data. Our study focuses on local information algorithms --- sequential algorithms where the network topology is initially unknown and is revealed only within a local neighborhood of vertices that have been irrevocably added to the output set. We address both network coverage problems as well as network search problems. Our results include local information algorithms for coverage problems whose performance closely matches the best possible even when information about network structure is unrestricted. We also demonstrate a sharp threshold on the level of visibility required: at a certain visibility level it is possible to design algorithms that nearly match the best approximation possible even with full access to the network structure, but with any less information it is impossible to achieve a reasonable approximation. For preferential attachment networks, we obtain polylogarithmic approximations to the problem of finding the smallest subgraph that connects a subset of nodes and the problem of finding the highest-degree nodes. This is achieved by addressing a decade-old open question of Bollobás and Riordan on locally finding the root in a preferential attachment process.
In the second part of the dissertation we focus on designing highly time efficient local algorithms for central mining problems on complex networks that have been in the focus of the research community for over a decade: finding a small set of influential nodes in the network, and fast ranking of nodes. Among our results is an essentially runtime-optimal local algorithm for the influence maximization problem in the standard independent cascades model of information diffusion and an essentially runtime-optimal local algorithm for the problem of returning all nodes with PageRank bigger than a given threshold. Our work demonstrates that locality is powerful enough to allow efficient solutions to many central algorithmic problems on complex networks.

CiteSeerX

ScholarlyCommons@Penn

Randomized Primitives for Big Data Processing

Author: Stöckel Morten
Publication venue: IT-Universitetet i København
Publication date: 01/01/2015

The IT University of Copenhagen's Repository

Proceedings of the 26th International Symposium on Theoretical Aspects of Computer Science (STACS'09)

Author: Albers Susanne
Marion Jean-Yves
Publication venue
Publication date: 01/01/2009

The Symposium on Theoretical Aspects of Computer Science (STACS) is held alternately in France and in Germany. The conference of February 26-28, 2009, held in Freiburg, is the 26th in this series. Previous meetings took place in Paris (1984), Saarbrücken (1985), Orsay (1986), Passau (1987), Bordeaux (1988), Paderborn (1989), Rouen (1990), Hamburg (1991), Cachan (1992), Würzburg (1993), Caen (1994), München (1995), Grenoble (1996), Lübeck (1997), Paris (1998), Trier (1999), Lille (2000), Dresden (2001), Antibes (2002), Berlin (2003), Montpellier (2004), Stuttgart (2005), Marseille (2006), Aachen (2007), and Bordeaux (2008). ..

Hochschulschriftenserver - Universität Frankfurt am Main

Sparse recovery and Fourier sampling

Author: Price Eric C
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2013

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 155-160). In the last decade a broad literature has arisen studying sparse recovery, the estimation of sparse vectors from low dimensional linear projections. Sparse recovery has a wide variety of applications such as streaming algorithms, image acquisition, and disease testing. A particularly important subclass of sparse recovery is the sparse Fourier transform, which considers the computation of a discrete Fourier transform when the output is sparse. Applications of the sparse Fourier transform include medical imaging, spectrum sensing, and purely computational tasks involving convolution. This thesis describes a coherent set of techniques that achieve optimal or near-optimal upper and lower bounds for a variety of sparse recovery problems. We give the following state-of-the-art algorithms for recovery of an approximately k-sparse vector in n dimensions: -- Two sparse Fourier transform algorithms, respectively taking ... time and ... samples. The latter is within log e log n of the optimal sample complexity when ... -- An algorithm for adaptive sparse recovery using ... measurements, showing that adaptivity can give substantial improvements when k is small. -- An algorithm for C-approximate sparse recovery with ... measurements, which matches our lower bound up to the log* k factor and gives the first improvement for ... In the second part of this thesis, we give lower bounds for the above problems and more. by Eric Price. Ph. D.

DSpace@MIT

27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany

Author: ESA <27. 2019, München>
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/09/2019

Digitale Bibliothek Thüringen

A Randomized Algorithm for Finding Frequent Elements in Streams Using O(loglogN) Space

Author: G. Cormode
H. Liu
M. Charikar
M. Fischer
N. Alon
P. Flajolet
R. Morris
R.M. Karp
R.S. Boyer
V. Chvátal
W. Hoeffding
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Crossref

Probabilistic methods for distributed information dissemination

Author: Haeupler Bernhard
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2013

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (p. 457-484). The ever-increasing growth of modern networks comes with a paradigm shift in network operation. Networks can no longer be abstracted as deterministic, centrally controlled systems with static topologies but need to be understood as highly distributed, dynamic systems with inherent unreliabilities. This makes many communication, coordination and computation tasks challenging and in many scenarios communication becomes a crucial bottleneck. In this thesis, we develop new algorithms and techniques to address these challenges. In particular we concentrate on broadcast and information dissemination tasks and introduce novel ideas on how randomization can lead to powerful, simple and practical communication primitives suitable for these modern networks. In this endeavor we combine and further develop tools from different disciplines trying to simultaneously address the distributed, information theoretic and algorithmic aspects of network communication. The two main probabilistic techniques developed to disseminate information in a network are gossip and random linear network coding. Gossip is an alternative to classical flooding approaches: Instead of nodes repeatedly forwarding information to all their neighbors, gossiping nodes forward information only to a small number of (random) neighbors. We show that, when done right, gossip disperses information almost as quickly as flooding, albeit with a drastically reduced communication overhead. Random linear network coding (RLNC) applies when a large amount of information or many messages are to be disseminated. Instead of routing messages through intermediate nodes, that is, following a classical store-and-forward approach, RLNC mixes messages together by forwarding random linear combinations of messages.
The simplicity and topology-obliviousness of this approach make RLNC particularly interesting for the distributed settings considered in this thesis. Unfortunately the performance of RLNC was not well understood even for the simplest such settings. We introduce a simple yet powerful analysis technique that allows us to prove optimal performance guarantees for all settings considered in the literature and many more that were not analyzable so far. Specifically, we give many new results for RLNC gossip algorithms, RLNC algorithms for dynamic networks, and RLNC with correlated data. We also provide a novel highly efficient distributed implementation of RLNC that achieves these performance guarantees while buffering only a minimal amount of information at intermediate nodes. We then apply our techniques to improve communication primitives in multi-hop radio networks. While radio networks inherently support broadcast communications, e.g., from one node to all surrounding nodes, interference of simultaneous transmissions makes multihop broadcast communication an interesting challenge. We show that, again, randomization holds the key for obtaining simple, efficient and distributed information dissemination protocols. In particular, using random back-off strategies to coordinate access to the shared medium leads to optimal gossip-like communications and applying RLNC achieves the first throughput-optimal multi-message communication primitives. Lastly we apply our probabilistic approach for analyzing simple, distributed propagation protocols in a broader context by studying algorithms for the Lovász Local Lemma. These algorithms find solutions to certain local constraint satisfaction problems by randomly fixing and propagating violations locally.
Our two main results show that, firstly, there are also efficient deterministic propagation strategies achieving the same and, secondly, using the random fixing strategy has the advantage of producing not just an arbitrary solution but an approximately uniformly random one. Both results lead to simple constructions for many locally consistent structures of interest that were not known to be efficiently constructible before. by Bernhard Haeupler. Ph.D.
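
The contrast between gossip and flooding described in the abstract above can be illustrated with a toy simulation. The sketch below assumes a complete graph on n nodes and a synchronous "push" model in which every informed node contacts one uniformly random node per round; it is not the protocol analyzed in the thesis, and the function name `push_gossip` is hypothetical.

```python
import random


def push_gossip(n, seed=0):
    """Simulate push gossip on a complete graph of n nodes.

    Node 0 starts informed. In each round every informed node sends the
    rumor to one node chosen uniformly at random (possibly itself or an
    already informed node). Returns (rounds, messages_sent).
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    messages = 0
    while len(informed) < n:
        # Each informed node pushes to exactly one random target this round.
        targets = {rng.randrange(n) for _ in range(len(informed))}
        messages += len(informed)
        informed |= targets
        rounds += 1
    return rounds, messages


if __name__ == "__main__":
    n = 1024
    rounds, messages = push_gossip(n, seed=1)
    # Flooding on a complete graph: every node forwards to all n - 1 others.
    flooding_messages = n * (n - 1)
    print(rounds, messages, flooding_messages)
```

Because the informed set can at most double per round, at least log2(n) rounds are needed; the simulation lets one compare the observed round count and message total against the n(n-1) messages of naive flooding under the same assumptions.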

DSpace@MIT

Actas da 10ª Conferência sobre Redes de Computadores

Author: Carvalho Paulo
Costa António
Dias Bruno
Lima Solange
Macedo Joaquim
Moreira Adriano
Nicolau Maria João
Santos Alexandre
Sousa Pedro
Publication venue: Universidade do Minho
Publication date: 01/01/2010

Universidade do Minho, CCTC, Centro Algoritmi, Cisco Systems, IEEE Portugal Sectio

Universidade do Minho: RepositoriUM