Sketching via hashing: from heavy hitters to compressed sensing to sparse Fourier transform
Sketching via hashing is a popular and useful method for processing large data sets. Its basic idea is as follows. Suppose that we have a large multi-set of elements S = {a_1, …, a_n}, and we would like to identify the elements that occur "frequently" in S. The algorithm starts by selecting a hash function h that maps the elements into an array c[1…m]. The array entries are initialized to 0. Then, for each element a ∈ S, the algorithm increments c[h(a)]. At the end of the process, each array entry c[j] contains the count of all data elements a ∈ S mapped to j.
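As a concrete illustration (not taken from the paper itself), a minimal Python sketch of this counting idea; the seeded SHA-256 hash, the array size m, and the test data are illustrative assumptions:

```python
import hashlib

def make_hash(seed: int, m: int):
    """Hash an element into one of m buckets (illustrative choice:
    a seeded SHA-256; any pairwise-independent family also works)."""
    def h(a) -> int:
        digest = hashlib.sha256(f"{seed}:{a}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % m
    return h

def sketch_counts(stream, m: int, seed: int = 0):
    """One pass over the multi-set S: increment c[h(a)] for each a."""
    h = make_hash(seed, m)
    c = [0] * m                  # array c[1..m], initialized to 0
    for a in stream:
        c[h(a)] += 1
    return c, h

# c[h(a)] upper-bounds the true count of a, since collisions only add;
# frequent elements can then be flagged by comparing c[h(a)] to a threshold.
S = ["x"] * 50 + ["y"] * 3 + ["z"] * 2
c, h = sketch_counts(S, m=16)
print(c[h("x")])  # >= 50; equals 50 unless another element collides
```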
Nearly Optimal Deterministic Algorithm for Sparse Walsh-Hadamard Transform
For every fixed constant α > 0, we design an algorithm for computing the k-sparse Walsh-Hadamard transform of an N-dimensional vector x ∈ ℝ^N in time k^(1+α) (log N)^(O(1)). Specifically, the algorithm is given query access to x and computes a k-sparse x̃ ∈ ℝ^N satisfying ‖x̃ − x̂‖₁ ≤ c‖x̂ − H_k(x̂)‖₁, for an absolute constant c > 0, where x̂ is the transform of x and H_k(x̂) is its best k-sparse approximation. Our algorithm is fully deterministic and only uses non-adaptive queries to x (i.e., all queries are determined and performed in parallel when the algorithm starts).
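For orientation only (this is not the paper's sublinear-time algorithm), a naive Python reference that computes the exact transform x̂ and the benchmark term H_k(x̂) appearing in the ℓ₁/ℓ₁ guarantee; the dimensions and random input are illustrative:

```python
import numpy as np

def fwht(x: np.ndarray) -> np.ndarray:
    """Exact (normalized) fast Walsh-Hadamard transform in O(N log N).
    This is the full transform x_hat that the paper's algorithm
    approximates in sublinear time; N must be a power of two."""
    x = x.astype(float).copy()
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)

def best_k_sparse(v: np.ndarray, k: int) -> np.ndarray:
    """H_k(v): keep the k largest-magnitude entries, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

# The guarantee says an output x_tilde is acceptable iff
#   ||x_tilde - x_hat||_1 <= c * ||x_hat - H_k(x_hat)||_1.
N, k = 1024, 8
x = np.random.randn(N)
x_hat = fwht(x)
print(np.linalg.norm(x_hat - best_k_sparse(x_hat, k), 1))  # benchmark error term
```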
An important technical tool that we use is a construction of nearly optimal and linear lossless condensers, which is a careful instantiation of the GUV condenser (Guruswami, Umans, Vadhan, JACM 2009). Moreover, we design a deterministic and non-adaptive ℓ₁/ℓ₁ compressed sensing scheme based on general lossless condensers that is equipped with a fast reconstruction algorithm running in time k^(1+α) (log N)^(O(1)) (for the GUV-based condenser) and is of independent interest. Our scheme significantly simplifies and improves an earlier expander-based construction due to Berinde, Gilbert, Indyk, Karloff, Strauss (Allerton 2008).
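To make the "measure non-adaptively once, then reconstruct fast" pattern concrete, here is a toy Python stand-in that substitutes a randomized count-sketch matrix for the paper's deterministic condenser-based one; every parameter and the recovery rule below are assumptions for illustration, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)

def measure(x, m, d):
    """Non-adaptive linear measurements: d hash rows with m buckets each
    and random signs (a randomized toy in place of the explicit
    condenser-based measurement matrix)."""
    N = len(x)
    h = rng.integers(0, m, size=(d, N))
    s = rng.choice([-1.0, 1.0], size=(d, N))
    y = np.zeros((d, m))
    for r in range(d):
        np.add.at(y[r], h[r], s[r] * x)
    return y, h, s

def recover(y, h, s, k):
    """Fast reconstruction: estimate each coordinate by a median over the
    d rows, then keep the k largest entries (the H_k truncation)."""
    d, N = h.shape
    est = np.median(s * y[np.arange(d)[:, None], h], axis=0)
    out = np.zeros(N)
    top = np.argsort(np.abs(est))[-k:]
    out[top] = est[top]
    return out

N, k = 4096, 5
x = np.zeros(N)
x[rng.choice(N, k, replace=False)] = rng.normal(0, 10, k)
y, h, s = measure(x, m=8 * k, d=7)
print(np.allclose(recover(y, h, s, k), x))  # exact here, since x is k-sparse
```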
Our methods use linear lossless condensers in a black-box fashion; therefore, any future improvement on explicit constructions of such condensers would immediately translate to improved parameters in our framework (potentially leading to reconstruction time k (log N)^(O(1)) with a reduced exponent in the poly-logarithmic factor, and eliminating the extra parameter α).
Finally, by allowing the algorithm to use randomness, while still using non-adaptive queries, the running time of the algorithm can be improved to Õ(k log³ N).