11 research outputs found

    On Deterministic Sketching and Streaming for Sparse Recovery and Norm Estimation

    We study classic streaming and sparse recovery problems using deterministic linear sketches, including $\ell_1/\ell_1$ and $\ell_\infty/\ell_1$ sparse recovery problems (the latter also being known as $\ell_1$-heavy hitters), norm estimation, and approximate inner product. We focus on devising a fixed matrix $A \in \mathbb{R}^{m \times n}$ and a deterministic recovery/estimation procedure which work for all possible input vectors simultaneously. Our results improve upon existing work, the following being our main contributions:
    • A proof that $\ell_\infty/\ell_1$ sparse recovery and inner product estimation are equivalent, and that incoherent matrices can be used to solve both problems. Our upper bound for the number of measurements is $m = O(\varepsilon^{-2} \min\{\log n, (\log n/\log(1/\varepsilon))^2\})$. We can also obtain fast sketching and recovery algorithms by making use of the Fast Johnson–Lindenstrauss transform. Both our running times and number of measurements improve upon previous work. We can also obtain better error guarantees than previous work in terms of a smaller tail of the input vector.
    • A new lower bound for the number of linear measurements required to solve $\ell_1/\ell_1$ sparse recovery. We show $\Omega(k/\varepsilon^2 + k \log(n/k)/\varepsilon)$ measurements are required to recover an $x'$ with $\|x - x'\|_1 \le (1+\varepsilon)\|x_{tail(k)}\|_1$, where $x_{tail(k)}$ is $x$ projected onto all but its largest $k$ coordinates in magnitude.
    • A tight bound of $m = \Theta(\varepsilon^{-2} \log(\varepsilon^2 n))$ on the number of measurements required to solve deterministic norm estimation, i.e., to recover $\|x\|_2 \pm \varepsilon\|x\|_1$.
    For all the problems we study, tight bounds are already known for the randomized complexity from previous work, except in the case of $\ell_1/\ell_1$ sparse recovery, where a nearly tight bound is known. Our work thus aims to study the deterministic complexities of these problems. We remark that some of the matrices used in our algorithms, although known to exist, currently are not yet explicit in the sense that deterministic polynomial time constructions are not yet known, although in all cases polynomial time Monte Carlo algorithms are known.
    Engineering and Applied Science
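    The role of incoherent matrices in the abstract above can be illustrated with a small sketch. It is not the paper's construction: it assumes a random ±1/√m sign matrix (a Monte Carlo choice, not an explicit one), and all function names are hypothetical. It relies only on the standard fact that if the columns of A have unit norm and pairwise inner products of magnitude at most μ, then the point query ⟨A_i, Ax⟩ differs from x_i by at most μ‖x‖_1.

```python
import math
import random


def random_sign_matrix(m, n, rng):
    """Return an m x n matrix with entries +-1/sqrt(m), so each column has unit norm.

    Assumption: a random matrix stands in for an incoherent matrix here.
    """
    scale = 1.0 / math.sqrt(m)
    return [[rng.choice((-scale, scale)) for _ in range(n)] for _ in range(m)]


def column(A, j):
    """Extract column j of a row-major matrix."""
    return [row[j] for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def coherence(A):
    """Largest |<A_i, A_j>| over distinct columns i != j (requires n >= 2)."""
    n = len(A[0])
    cols = [column(A, j) for j in range(n)]
    return max(abs(dot(cols[i], cols[j]))
               for i in range(n) for j in range(i + 1, n))


def sketch(A, x):
    """Linear sketch y = A x."""
    return [dot(row, x) for row in A]


def point_query(A, y, i):
    """Estimate x_i from the sketch alone as <A_i, y>."""
    return dot(column(A, i), y)


if __name__ == "__main__":
    rng = random.Random(0)
    A = random_sign_matrix(64, 40, rng)
    x = [0.0] * 40
    x[3], x[17] = 5.0, -2.0
    y = sketch(A, x)
    mu = coherence(A)
    print("coherence:", mu)
    print("estimate of x[3]:", point_query(A, y, 3))
```

    The guarantee holds for every input vector at once because it depends only on the coherence of the fixed matrix, which is the "for all" flavour of deterministic sketching the abstract describes.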

    The Power Of Locality In Network Algorithms

    Over the last decade we have witnessed the rapid proliferation of large-scale complex networks, spanning many social, information and technological domains. While many of the tasks which users of such networks face are essentially global and involve the network as a whole, the size of these networks is huge and the information available to users is only local. In this dissertation we show that even when faced with stringent locality constraints, one can still effectively solve prominent algorithmic problems on such networks. In the first part of the dissertation we present a natural algorithmic framework designed to model the behaviour of an external agent trying to solve a network optimization problem with limited access to the network data. Our study focuses on local information algorithms: sequential algorithms where the network topology is initially unknown and is revealed only within a local neighborhood of vertices that have been irrevocably added to the output set. We address both network coverage problems and network search problems. Our results include local information algorithms for coverage problems whose performance closely matches the best possible even when information about network structure is unrestricted. We also demonstrate a sharp threshold on the level of visibility required: at a certain visibility level it is possible to design algorithms that nearly match the best approximation possible even with full access to the network structure, but with any less information it is impossible to achieve a reasonable approximation. For preferential attachment networks, we obtain polylogarithmic approximations to the problem of finding the smallest subgraph that connects a subset of nodes and the problem of finding the highest-degree nodes. This is achieved by addressing a decade-old open question of Bollobás and Riordan on locally finding the root in a preferential attachment process.
    In the second part of the dissertation we focus on designing highly time-efficient local algorithms for central mining problems on complex networks that have been in the focus of the research community for over a decade: finding a small set of influential nodes in the network, and fast ranking of nodes. Among our results is an essentially runtime-optimal local algorithm for the influence maximization problem in the standard independent cascades model of information diffusion and an essentially runtime-optimal local algorithm for the problem of returning all nodes with PageRank bigger than a given threshold. Our work demonstrates that locality is powerful enough to allow efficient solutions to many central algorithmic problems on complex networks.
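    The access model behind local information algorithms, as described in the abstract above, can be sketched as follows. This only illustrates the model, not any algorithm from the dissertation: the class `LocalView`, the function `explore`, and the smallest-label selection rule are all hypothetical, and visibility is assumed to extend exactly one hop from each selected vertex.

```python
class LocalView:
    """Hidden graph that reveals a vertex's neighbors only once the vertex is selected."""

    def __init__(self, adjacency, seed):
        self._adj = adjacency      # full topology, never read directly by the algorithm
        self.selected = []         # output set; additions are irrevocable
        self.visible = {seed}      # vertices the algorithm currently knows about

    def select(self, node):
        """Add a visible vertex to the output set and reveal its neighborhood."""
        if node not in self.visible:
            raise ValueError("vertex has not been revealed yet")
        if node in self.selected:
            raise ValueError("vertex was already selected")
        self.selected.append(node)
        revealed = set(self._adj[node]) - self.visible
        self.visible |= revealed
        return revealed

    def frontier(self):
        """Visible vertices that have not been selected yet."""
        return sorted(self.visible - set(self.selected))


def explore(view, budget):
    """Select up to `budget` vertices, always taking the smallest-labelled frontier vertex.

    Returns the covered set: selected vertices plus their revealed neighbors.
    """
    for _ in range(budget):
        candidates = view.frontier()
        if not candidates:
            break
        view.select(candidates[0])
    return set(view.visible)


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    view = LocalView(path, seed=0)
    print(explore(view, budget=2))
```

    Any selection rule has to be expressed through `select` and `frontier`, which makes it explicit that the algorithm cannot, for example, jump to a high-degree vertex it has not yet discovered.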

    Randomized Primitives for Big Data Processing

    Proceedings of the 26th International Symposium on Theoretical Aspects of Computer Science (STACS'09)

    The Symposium on Theoretical Aspects of Computer Science (STACS) is held alternately in France and in Germany. The conference of February 26-28, 2009, held in Freiburg, is the 26th in this series. Previous meetings took place in Paris (1984), Saarbrücken (1985), Orsay (1986), Passau (1987), Bordeaux (1988), Paderborn (1989), Rouen (1990), Hamburg (1991), Cachan (1992), Würzburg (1993), Caen (1994), München (1995), Grenoble (1996), Lübeck (1997), Paris (1998), Trier (1999), Lille (2000), Dresden (2001), Antibes (2002), Berlin (2003), Montpellier (2004), Stuttgart (2005), Marseille (2006), Aachen (2007), and Bordeaux (2008). ..

    Sparse recovery and Fourier sampling

    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 155-160).
    In the last decade a broad literature has arisen studying sparse recovery, the estimation of sparse vectors from low dimensional linear projections. Sparse recovery has a wide variety of applications such as streaming algorithms, image acquisition, and disease testing. A particularly important subclass of sparse recovery is the sparse Fourier transform, which considers the computation of a discrete Fourier transform when the output is sparse. Applications of the sparse Fourier transform include medical imaging, spectrum sensing, and purely computational tasks involving convolution. This thesis describes a coherent set of techniques that achieve optimal or near-optimal upper and lower bounds for a variety of sparse recovery problems. We give the following state-of-the-art algorithms for recovery of an approximately k-sparse vector in n dimensions:
    -- Two sparse Fourier transform algorithms, respectively taking ... time and ... samples. The latter is within log e log n of the optimal sample complexity when ...
    -- An algorithm for adaptive sparse recovery using ... measurements, showing that adaptivity can give substantial improvements when k is small.
    -- An algorithm for C-approximate sparse recovery with ... measurements, which matches our lower bound up to the log* k factor and gives the first improvement for ...
    In the second part of this thesis, we give lower bounds for the above problems and more.
    by Eric Price. Ph. D.

    27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany

    Probabilistic methods for distributed information dissemination

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (p. 457-484).
    The ever-increasing growth of modern networks comes with a paradigm shift in network operation. Networks can no longer be abstracted as deterministic, centrally controlled systems with static topologies but need to be understood as highly distributed, dynamic systems with inherent unreliabilities. This makes many communication, coordination and computation tasks challenging, and in many scenarios communication becomes a crucial bottleneck. In this thesis, we develop new algorithms and techniques to address these challenges. In particular we concentrate on broadcast and information dissemination tasks and introduce novel ideas on how randomization can lead to powerful, simple and practical communication primitives suitable for these modern networks. In this endeavor we combine and further develop tools from different disciplines to simultaneously address the distributed, information theoretic and algorithmic aspects of network communication. The two main probabilistic techniques developed to disseminate information in a network are gossip and random linear network coding. Gossip is an alternative to classical flooding approaches: instead of nodes repeatedly forwarding information to all their neighbors, gossiping nodes forward information only to a small number of (random) neighbors. We show that, when done right, gossip disperses information almost as quickly as flooding, albeit with a drastically reduced communication overhead. Random linear network coding (RLNC) applies when a large amount of information or many messages are to be disseminated. Instead of routing messages through intermediate nodes, that is, following a classical store-and-forward approach, RLNC mixes messages together by forwarding random linear combinations of messages.
    The simplicity and topology-obliviousness of this approach make RLNC particularly interesting for the distributed settings considered in this thesis. Unfortunately the performance of RLNC was not well understood even for the simplest such settings. We introduce a simple yet powerful analysis technique that allows us to prove optimal performance guarantees for all settings considered in the literature and many more that were not analyzable so far. Specifically, we give many new results for RLNC gossip algorithms, RLNC algorithms for dynamic networks, and RLNC with correlated data. We also provide a novel highly efficient distributed implementation of RLNC that achieves these performance guarantees while buffering only a minimal amount of information at intermediate nodes. We then apply our techniques to improve communication primitives in multi-hop radio networks. While radio networks inherently support broadcast communications, e.g., from one node to all surrounding nodes, interference of simultaneous transmissions makes multihop broadcast communication an interesting challenge. We show that, again, randomization holds the key for obtaining simple, efficient and distributed information dissemination protocols. In particular, using random back-off strategies to coordinate access to the shared medium leads to optimal gossip-like communications and applying RLNC achieves the first throughput-optimal multi-message communication primitives. Lastly, we apply our probabilistic approach for analyzing simple, distributed propagation protocols in a broader context by studying algorithms for the Lovász Local Lemma. These algorithms find solutions to certain local constraint satisfaction problems by randomly fixing and propagating violations locally.
    Our two main results show that, firstly, there are also efficient deterministic propagation strategies achieving the same and, secondly, using the random fixing strategy has the advantage of producing not just an arbitrary solution but an approximately uniformly random one. Both results lead to simple constructions for many locally consistent structures of interest that were not previously known to be efficiently constructible.
    by Bernhard Haeupler. Ph.D.
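    The idea of random linear network coding described in the abstract above, forwarding random linear combinations instead of individual messages, can be sketched over GF(2), where a linear combination is just an XOR. This is a minimal illustration under stated assumptions, not the thesis's implementation: messages are modelled as Python integers, coefficient vectors as bitmasks, and the names `encode`, `Decoder`, `receive`, `recode` and `decode` are hypothetical.

```python
import random


def encode(messages, rng):
    """Return one coded packet: a random nonzero GF(2) combination of the messages."""
    k = len(messages)
    coeffs = rng.randrange(1, 1 << k)   # bit i set => message i is part of the XOR
    payload = 0
    for i in range(k):
        if (coeffs >> i) & 1:
            payload ^= messages[i]
    return coeffs, payload


class Decoder:
    """Collects coded packets and decodes by Gaussian elimination over GF(2)."""

    def __init__(self, k):
        self.k = k
        self.pivots = {}   # pivot bit -> (coeffs, payload); pivot is the top bit of coeffs

    def receive(self, coeffs, payload):
        """Reduce a packet against the stored rows; return True if it was innovative."""
        while coeffs:
            pivot = coeffs.bit_length() - 1
            if pivot not in self.pivots:
                self.pivots[pivot] = (coeffs, payload)
                return True
            stored_coeffs, stored_payload = self.pivots[pivot]
            coeffs ^= stored_coeffs
            payload ^= stored_payload
        return False

    def rank(self):
        return len(self.pivots)

    def recode(self, rng):
        """Forward a random combination of the stored rows, without decoding first."""
        coeffs, payload = 0, 0
        for stored_coeffs, stored_payload in self.pivots.values():
            if rng.random() < 0.5:
                coeffs ^= stored_coeffs
                payload ^= stored_payload
        return coeffs, payload

    def decode(self):
        """Return the original messages once the rank is full, else None."""
        if self.rank() < self.k:
            return None
        solved = {}
        for pivot in range(self.k):     # back-substitute from the lowest pivot upward
            coeffs, payload = self.pivots[pivot]
            for j in range(pivot):
                if (coeffs >> j) & 1:
                    payload ^= solved[j]
            solved[pivot] = payload
        return [solved[i] for i in range(self.k)]
```

    A relay can call `recode` on whatever it has received so far and pass the result on, so it never needs to know which messages its neighbor is missing; this is the topology-oblivious behaviour the abstract refers to.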

    Actas da 10ª Conferência sobre Redes de Computadores

    Universidade do Minho, CCTC, Centro Algoritmi, Cisco Systems, IEEE Portugal Sectio