Optimal Lower Bound for Itemset Frequency Indicator Sketches
Given a database, a common problem is to find the pairs or k-tuples of
items that frequently co-occur. One specific problem is to create a small space
"sketch" of the data that records which k-tuples appear in more than an
ε fraction of rows of the database.
We improve the lower bound of Liberty, Mitzenmacher, and Thaler [LMT14],
showing that bits are necessary
even in the case of . This matches the sampling upper bound for all
, and (in the case of ) another trivial upper
bound for .
Comment: 3 page
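The sampling upper bound mentioned in this abstract can be pictured with a minimal Python sketch, assuming the sketch is simply a uniform sample of database rows and that queries are answered on the sample. The function names, the sample size, and the seed are illustrative assumptions, not taken from the paper, and the sample size needed for a given accuracy guarantee is not derived here.

```python
import random


def build_sample_sketch(rows, sample_size, seed=0):
    """Keep a uniform random sample (with replacement) of the database rows."""
    rng = random.Random(seed)
    return [frozenset(rng.choice(rows)) for _ in range(sample_size)]


def is_frequent(sketch, itemset, threshold):
    """Report whether `itemset` appears in more than a `threshold`
    fraction of the sampled rows."""
    items = frozenset(itemset)
    hits = sum(1 for row in sketch if items <= row)
    return hits > threshold * len(sketch)


# Hypothetical toy database: each row is a set of item ids.
rows = [{0, 1, 2}, {0, 1}, {0, 1, 3}, {0, 1, 4}]
sketch = build_sample_sketch(rows, sample_size=50)
print(is_frequent(sketch, (0, 1), 0.5))   # pair present in every row
print(is_frequent(sketch, (2, 3), 0.1))   # pair present in no row
```

The sketch stores only the sampled rows, so its size depends on the sample size rather than on the number of rows in the database.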
The Noisy Power Method: A Meta Algorithm with Applications
We provide a new robust convergence analysis of the well-known power method
for computing the dominant singular vectors of a matrix that we call the noisy
power method. Our result characterizes the convergence behavior of the
algorithm when a significant amount of noise is introduced after each
matrix-vector multiplication. The noisy power method can be seen as a
meta-algorithm that has recently found a number of important applications in a
broad range of machine learning problems including alternating minimization for
matrix completion, streaming principal component analysis (PCA), and
privacy-preserving spectral analysis. Our general analysis subsumes several
existing ad-hoc convergence bounds and resolves a number of open problems in
multiple applications including streaming PCA and privacy-preserving singular
vector computation.
Comment: NIPS 201
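The idea of adding noise after each matrix-vector multiplication can be illustrated with a small Python sketch. It is a simplification under stated assumptions: it tracks a single vector rather than a block of vectors, it uses a symmetric matrix so that the dominant singular vector coincides with the dominant eigenvector, and the Gaussian noise model, `noise_scale`, and all function names are hypothetical choices for illustration only.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def mat_vec(A, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def noisy_power_method(A, iterations=60, noise_scale=1e-3, seed=0):
    """Power iteration with Gaussian noise added after each multiplication."""
    rng = random.Random(seed)
    x = normalize([rng.gauss(0.0, 1.0) for _ in range(len(A))])
    for _ in range(iterations):
        y = mat_vec(A, x)
        # Assumed noise model: independent Gaussian perturbation per coordinate.
        y = [yi + rng.gauss(0.0, noise_scale) for yi in y]
        x = normalize(y)
    return x


def rayleigh_quotient(A, x):
    """Estimate the eigenvalue associated with a unit vector x."""
    return sum(xi * yi for xi, yi in zip(x, mat_vec(A, x)))


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 1.0]]
x = noisy_power_method(A)
print(rayleigh_quotient(A, x))  # close to the top eigenvalue (7 + sqrt(5)) / 2
```

With small noise relative to the gap between the two largest eigenvalues, the iterate still settles near the dominant direction, which is the qualitative behavior the abstract describes.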
Respiratory Medication Adherence : Toward a Common Language and a Shared Vision
Part of this work, conducted by E. Van Ganse, has been performed in the context of the ASTRO-LAB project, which received funding from the European Community's 7th Framework (FP7/2007-2013) under grant agreement no. 282593. Teva supported the meeting costs at which the concepts in this paper were discussed by the co-authors and the open access publication fee for this article. The authors had full editorial control over the ideas presented.
Peer reviewed. Publisher PD
Robust polynomial regression up to the information theoretic limit
We consider the problem of robust polynomial regression, where one receives
samples that are usually within of a polynomial , but have a chance of being arbitrary adversarial outliers.
Previously, it was known how to efficiently estimate only when . We give an algorithm that works for the entire feasible
range of , while simultaneously improving other parameters of the
problem. We complement our algorithm, which gives a factor 2 approximation,
with impossibility results that show, for example, that a approximation
is impossible even with infinitely many samples.
Comment: 19 pages. To appear in FOCS 201
The Sketching Complexity of Graph and Hypergraph Counting
Subgraph counting is a fundamental primitive in graph processing, with
applications in social network analysis (e.g., estimating the clustering
coefficient of a graph), database processing and other areas. The space
complexity of subgraph counting has been studied extensively in the literature,
but many natural settings are still not well understood. In this paper we
revisit the subgraph (and hypergraph) counting problem in the sketching model,
where the algorithm's state as it processes a stream of updates to the graph is
a linear function of the stream. This model has recently received a lot of
attention in the literature, and has become a standard model for solving
dynamic graph streaming problems.
In this paper we give a tight bound on the sketching complexity of counting
the number of occurrences of a small subgraph H in a bounded degree graph
presented as a stream of edge updates. Specifically, we show that the space
complexity of the problem is governed by the fractional vertex cover number of
the graph H. Our subgraph counting algorithm implements a natural vertex
sampling approach, with sampling probabilities governed by the vertex cover of
H. Our main technical contribution lies in a new set of Fourier analytic
tools that we develop to analyze multiplayer communication protocols in the
simultaneous communication model, allowing us to prove a tight lower bound. We
believe that our techniques are likely to find applications in other settings.
Besides giving tight bounds for all graphs H, both our algorithm and lower
bounds extend to the hypergraph setting, albeit with some loss in space
complexity.
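A vertex sampling estimator of the general kind mentioned in this abstract can be sketched in Python for the special case of triangles. This is a simplified illustration, not the paper's algorithm: it samples every vertex with the same probability `p` instead of using probabilities derived from a vertex cover, and it works on a stored edge list rather than a linear sketch of a stream. All names are hypothetical.

```python
import random
from itertools import combinations


def count_triangles(edges, vertices):
    """Exactly count triangles in the subgraph induced by `vertices`."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u != v and u in adj and v in adj:
            adj[u].add(v)
            adj[v].add(u)
    count = 0
    for a, b, c in combinations(sorted(adj), 3):
        if b in adj[a] and c in adj[a] and c in adj[b]:
            count += 1
    return count


def estimate_triangles(edges, p, seed=0):
    """Sample each vertex independently with probability p, count triangles
    among sampled vertices, and rescale by 1 / p**3."""
    rng = random.Random(seed)
    vertices = {v for edge in edges for v in edge}
    sampled = {v for v in sorted(vertices) if rng.random() < p}
    return count_triangles(edges, sampled) / p ** 3


# Complete graph on 4 vertices, which contains 4 triangles.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(estimate_triangles(k4, p=1.0))  # p = 1 keeps every vertex: exact count
```

Each triangle survives the sampling with probability `p**3`, so dividing by `p**3` makes the estimate unbiased; only edges between sampled vertices need to be stored.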
On the Power of Adaptivity in Sparse Recovery
The goal of (stable) sparse recovery is to recover a k-sparse approximation x*
of a vector x from linear measurements of x. Specifically, the goal is
to recover x* such that ||x-x*||_p <= C min_{k-sparse x'} ||x-x'||_q for some
constant C and norm parameters p and q. It is known that, for or
, this task can be accomplished using non-adaptive
measurements [CRT06] and that this bound is tight [DIPW10,FPRU10,PW11].
In this paper we show that if one is allowed to perform measurements that are
adaptive, then the number of measurements can be considerably reduced.
Specifically, for and we show:
- A scheme with measurements that uses rounds. This is a significant improvement over the best possible non-adaptive bound.
- A scheme with measurements that uses /two/ rounds. This improves over the best possible non-adaptive bound.
To the best of our knowledge, these are the first results of this type.
As an independent application, we show how to solve the problem of finding a
duplicate in a data stream of items drawn from using
bits of space and passes, improving over the best
possible space complexity achievable using a single pass.
Comment: 18 pages; appearing at FOCS 201
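The recovery guarantee ||x-x*||_p <= C min_{k-sparse x'} ||x-x'||_q stated in this abstract can be checked numerically with a short Python sketch. It only evaluates the guarantee for a given candidate x*; it does not implement any measurement scheme, adaptive or otherwise. The function names are hypothetical, and the sketch relies on the standard fact that keeping the k largest-magnitude entries gives the best k-sparse approximation.

```python
def lp_norm(v, p):
    """Compute the l_p norm of a vector for p >= 1."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def best_k_sparse(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def tail_error(x, k, q):
    """Evaluate min over k-sparse x' of ||x - x'||_q."""
    approx = best_k_sparse(x, k)
    return lp_norm([a - b for a, b in zip(x, approx)], q)


def satisfies_guarantee(x, x_star, k, C, p, q):
    """Check whether ||x - x*||_p <= C * min_{k-sparse x'} ||x - x'||_q."""
    err = lp_norm([a - b for a, b in zip(x, x_star)], p)
    return err <= C * tail_error(x, k, q)


x = [5.0, -3.0, 0.5, 0.1]
print(best_k_sparse(x, 2))                                        # [5.0, -3.0, 0.0, 0.0]
print(satisfies_guarantee(x, best_k_sparse(x, 2), 2, 1.0, 1, 1))  # True
print(satisfies_guarantee(x, [0.0] * 4, 2, 1.0, 1, 1))            # False
```

The all-zero candidate fails because its error includes the two large entries, while the right-hand side only measures the tail left after removing them.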