31 research outputs found

    Sketching via hashing: from heavy hitters to compressed sensing to sparse Fourier transform

    Sketching via hashing is a popular and useful method for processing large data sets. Its basic idea is as follows. Suppose that we have a large multi-set of elements S=[formula], and we would like to identify the elements that occur "frequently" in S. The algorithm starts by selecting a hash function h that maps the elements into an array c[1…m]. The array entries are initialized to 0. Then, for each element a ∈ S, the algorithm increments c[h(a)]. At the end of the process, each array entry c[j] contains the count of all data elements a ∈ S mapped to j.
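The counting procedure described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `sketch_counts`, the `seed` parameter, and the use of Python's built-in `hash` as the hash function h are all assumptions made here, and the array is indexed 0…m-1 rather than 1…m.

```python
def sketch_counts(elements, m, seed=0):
    """Single-array hashing sketch: increment c[h(a)] for every a in the multiset.

    `seed` and the use of Python's built-in hash are illustrative assumptions;
    the abstract only requires some hash function h into an array of size m.
    """
    c = [0] * m  # array entries are initialized to 0

    def h(a):
        # Hypothetical hash function mapping an element to a bucket in 0..m-1.
        return hash((seed, a)) % m

    for a in elements:
        c[h(a)] += 1  # one increment per occurrence of a
    return c, h


# Usage: the bucket of an element never undercounts that element,
# because colliding elements can only add to the same counter.
S = ["x", "y", "x", "z", "x", "y"]
c, h = sketch_counts(S, m=8)
estimate_x = c[h("x")]  # at least 3, the true count of "x"
```

Since collisions only add to a counter, c[h(a)] is an upper bound on the true count of a; an element that occurs frequently therefore always lands in a bucket with a large count.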

    Nearly Optimal Deterministic Algorithm for Sparse Walsh-Hadamard Transform

    For every fixed constant $\alpha > 0$, we design an algorithm for computing the $k$-sparse Walsh-Hadamard transform of an $N$-dimensional vector $x \in \mathbb{R}^N$ in time $k^{1+\alpha} (\log N)^{O(1)}$. Specifically, the algorithm is given query access to $x$ and computes a $k$-sparse $\tilde{x} \in \mathbb{R}^N$ satisfying $\|\tilde{x} - \hat{x}\|_1 \leq c \|\hat{x} - H_k(\hat{x})\|_1$, for an absolute constant $c > 0$, where $\hat{x}$ is the transform of $x$ and $H_k(\hat{x})$ is its best $k$-sparse approximation. Our algorithm is fully deterministic and only uses non-adaptive queries to $x$ (i.e., all queries are determined and performed in parallel when the algorithm starts). An important technical tool that we use is a construction of nearly optimal and linear lossless condensers which is a careful instantiation of the GUV condenser (Guruswami, Umans, Vadhan, JACM 2009). Moreover, we design a deterministic and non-adaptive $\ell_1/\ell_1$ compressed sensing scheme based on general lossless condensers that is equipped with a fast reconstruction algorithm running in time $k^{1+\alpha} (\log N)^{O(1)}$ (for the GUV-based condenser) and is of independent interest. Our scheme significantly simplifies and improves an earlier expander-based construction due to Berinde, Gilbert, Indyk, Karloff, Strauss (Allerton 2008). Our methods use linear lossless condensers in a black box fashion; therefore, any future improvement on explicit constructions of such condensers would immediately translate to improved parameters in our framework (potentially leading to $k (\log N)^{O(1)}$ reconstruction time with a reduced exponent in the poly-logarithmic factor, and eliminating the extra parameter $\alpha$). Finally, by allowing the algorithm to use randomness, while still using non-adaptive queries, the running time of the algorithm can be improved to $\tilde{O}(k \log^3 N)$.
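To make the quantities in the guarantee concrete, the following sketch computes the dense Walsh-Hadamard transform, the best k-sparse approximation, and the tail error that the approximation guarantee is measured against. It is not the sparse algorithm from the abstract: it is the standard O(N log N) butterfly, shown only as a reference. The function names are hypothetical, and the unnormalized transform is an assumption, since the abstract does not state a normalization.

```python
def wht(x):
    """Dense, unnormalized Walsh-Hadamard transform (len(x) must be a power of two).

    This is the standard butterfly, NOT the sparse algorithm from the abstract.
    """
    a = list(x)
    n = len(a)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                u, v = a[j], a[j + h]
                a[j], a[j + h] = u + v, u - v
        h *= 2
    return a


def best_k_sparse(v, k):
    """H_k(v): keep the k largest-magnitude entries of v and zero out the rest."""
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:k])
    return [v[i] if i in keep else 0 for i in range(len(v))]


def l1_norm(v):
    return sum(abs(t) for t in v)


def tail_error(x, k):
    """The l1 distance between the transform of x and its best k-sparse approximation."""
    x_hat = wht(x)
    h_k = best_k_sparse(x_hat, k)
    return l1_norm([p - q for p, q in zip(x_hat, h_k)])
```

When the transform of x is exactly k-sparse, `tail_error(x, k)` is 0, so the guarantee forces the output to equal the transform exactly; otherwise the output may deviate by at most c times this tail.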