6,101 research outputs found
Approximation algorithms for wavelet transform coding of data streams
This paper addresses the problem of finding a B-term wavelet representation
of a given discrete function whose distance from f is
minimized. The problem is well understood when we seek to minimize the
Euclidean distance between f and its representation. The first known algorithms
for finding provably approximate representations minimizing general
distances (including ) under a wide variety of compactly supported
wavelet bases are presented in this paper. For the Haar basis, a polynomial
time approximation scheme is demonstrated. These algorithms are applicable in
the one-pass sublinear-space data stream model of computation. They generalize
naturally to multiple dimensions and weighted norms. A universal representation
that provides a provable approximation guarantee under all p-norms
simultaneously; and the first approximation algorithms for bit-budget versions
of the problem, known as adaptive quantization, are also presented. Further, it
is shown that the algorithms presented here can be used to select a basis from
a tree-structured dictionary of bases and find a B-term representation of the
given function that provably approximates its best dictionary-basis
representation.Comment: Added a universal representation that provides a provable
approximation guarantee under all p-norms simultaneousl
Towards Optimal Moment Estimation in Streaming and Distributed Models
One of the oldest problems in the data stream model is to approximate the p-th moment ||X||_p^p = sum_{i=1}^n X_i^p of an underlying non-negative vector X in R^n, which is presented as a sequence of poly(n) updates to its coordinates. Of particular interest is when p in (0,2]. Although a tight space bound of Theta(epsilon^-2 log n) bits is known for this problem when both positive and negative updates are allowed, surprisingly there is still a gap in the space complexity of this problem when all updates are positive. Specifically, the upper bound is O(epsilon^-2 log n) bits, while the lower bound is only Omega(epsilon^-2 + log n) bits. Recently, an upper bound of O~(epsilon^-2 + log n) bits was obtained under the assumption that the updates arrive in a random order.
We show that for p in (0, 1], the random order assumption is not needed. Namely, we give an upper bound for worst-case streams of O~(epsilon^-2 + log n) bits for estimating |X |_p^p. Our techniques also give new upper bounds for estimating the empirical entropy in a stream. On the other hand, we show that for p in (1,2], in the natural coordinator and blackboard distributed communication topologies, there is an O~(epsilon^-2) bit max-communication upper bound based on a randomized rounding scheme. Our protocols also give rise to protocols for heavy hitters and approximate matrix product. We generalize our results to arbitrary communication topologies G, obtaining an O~(epsilon^2 log d) max-communication upper bound, where d is the diameter of G. Interestingly, our upper bound rules out natural communication complexity-based approaches for proving an Omega(epsilon^-2 log n) bit lower bound for p in (1,2] for streaming algorithms. In particular, any such lower bound must come from a topology with large diameter
- …