
    Sketching via hashing: from heavy hitters to compressed sensing to sparse Fourier transform

    Sketching via hashing is a popular and useful method for processing large data sets. Its basic idea is as follows. Suppose that we have a large multi-set of elements S = {a_1, …, a_n}, and we would like to identify the elements that occur “frequently” in S. The algorithm starts by selecting a hash function h that maps the elements into an array c[1…m]. The array entries are initialized to 0. Then, for each element a ∈ S, the algorithm increments c[h(a)]. At the end of the process, each array entry c[j] contains the count of all data elements a ∈ S mapped to j.
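    A minimal Python sketch of this counting scheme, under the abstract's description (function and variable names are illustrative; practical heavy-hitter sketches such as Count-Min use several independent hash rows and take a minimum across them):

```python
import random

def build_sketch(stream, m, seed=0):
    """One hash row of a counting sketch: hash each element into an
    array of m counters and increment the bucket it lands in."""
    rng = random.Random(seed)
    salt = rng.getrandbits(32)          # randomizes the hash function h
    h = lambda x: hash((salt, x)) % m
    c = [0] * m                         # the array c[1...m], zero-initialized
    for a in stream:
        c[h(a)] += 1                    # increment c[h(a)]
    return c, h

stream = ["x"] * 50 + ["y"] * 3 + ["z"] * 2
c, h = build_sketch(stream, m=16)
# c[h(a)] upper-bounds the true count of a: collisions only ever add.
assert c[h("x")] >= 50
```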

    New Constructions of RIP Matrices with Fast Multiplication and Fewer Rows

    In this paper, we present novel constructions of matrices with the restricted isometry property (RIP) that support fast matrix-vector multiplication. Our guarantees are the best known, and can also be used to obtain the best known guarantees for fast Johnson-Lindenstrauss transforms. In compressed sensing, the restricted isometry property is a sufficient condition for the efficient reconstruction of a nearly $k$-sparse vector $x \in \mathbb{C}^d$ from $m$ linear measurements $\Phi x$. It is desirable for $m$ to be small, and further it is desirable for $\Phi$ to support fast matrix-vector multiplication. Among other applications, fast multiplication improves the runtime of iterative recovery algorithms which repeatedly multiply by $\Phi$ or $\Phi^*$. The main contribution of this work is a novel randomized construction of RIP matrices $\Phi \in \mathbb{C}^{m \times d}$, preserving the $\ell_2$ norms of all $k$-sparse vectors with distortion $1 + \epsilon$, where the matrix-vector multiply $\Phi x$ can be computed in nearly linear time. The number of rows $m$ is on the order of $\epsilon^{-2} k \log d \log^2(k \log d)$, an improvement on previous analyses by a logarithmic factor. Our construction, together with a connection between RIP matrices and the Johnson-Lindenstrauss lemma in [Krahmer-Ward, SIAM J. Math. Anal. 2011], also implies fast Johnson-Lindenstrauss embeddings with asymptotically fewer rows than previously known. Our construction is actually a recipe for improving any existing family of RIP matrices. Briefly, we apply an appropriate sparse hash matrix with sign flips to any suitable family of RIP matrices. We show that the embedding properties of the original family are maintained, while at the same time improving the number of rows. The main tool in our analysis is a recent bound for the supremum of certain types of Rademacher chaos processes in [Krahmer-Mendelson-Rauhut, Comm. Pure Appl. Math., to appear].
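    A rough Python sketch of the recipe described above, under assumptions not taken from the paper: the base RIP family is a row-subsampled DFT (a standard example), the sparse hash matrix has a single random ±1 entry per column, and the base matrix is materialized densely here for simplicity rather than applied via an FFT:

```python
import numpy as np

rng = np.random.default_rng(0)

def base_rip(m, d):
    """Base RIP family: m rows sampled without replacement from the
    d x d DFT matrix (a standard RIP construction), scaled by 1/sqrt(m)."""
    rows = rng.choice(d, size=m, replace=False)
    return np.exp(-2j * np.pi * np.outer(rows, np.arange(d)) / d) / np.sqrt(m)

def sparse_hash(m_new, m):
    """Sparse hash with sign flips: one random +/-1 per column, placed in
    a random row (count-sketch style), compressing m rows down to m_new."""
    H = np.zeros((m_new, m))
    H[rng.integers(m_new, size=m), np.arange(m)] = rng.choice([-1.0, 1.0], size=m)
    return H

m, m_new, d = 256, 128, 1024
Phi = sparse_hash(m_new, m) @ base_rip(m, d)  # candidate RIP matrix with fewer rows
# Applying Phi stays fast in principle: the hash-and-sign step is O(m),
# and the subsampled DFT can be applied with an FFT instead of a dense matrix.
```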

    Toward a unified theory of sparse dimensionality reduction in Euclidean space

    Let $\Phi \in \mathbb{R}^{m \times n}$ be a sparse Johnson-Lindenstrauss transform [KN14] with $s$ non-zeroes per column. For a subset $T$ of the unit sphere and given $\varepsilon \in (0,1/2)$, we study settings for $m, s$ required to ensure $$\mathbb{E}_\Phi \sup_{x \in T} \left| \|\Phi x\|_2^2 - 1 \right| < \varepsilon,$$ i.e. so that $\Phi$ preserves the norm of every $x \in T$ simultaneously and multiplicatively up to $1 + \varepsilon$. We introduce a new complexity parameter, which depends on the geometry of $T$, and show that it suffices to choose $s$ and $m$ such that this parameter is small. Our result is a sparse analog of Gordon's theorem, which was concerned with a dense $\Phi$ having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results in using the sparse Johnson-Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso.
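    The sparse transform itself is simple to instantiate. Below is a small illustrative Python construction (the parameter choices are arbitrary, not the ones the theory prescribes): each column receives $s$ random ±1 entries in distinct rows, scaled by $1/\sqrt{s}$, and we empirically check the distortion over a small random test set $T$:

```python
import numpy as np

rng = np.random.default_rng(1)

def sparse_jl(m, n, s):
    """Sparse JL matrix: each column has exactly s nonzero entries,
    each a random sign scaled by 1/sqrt(s), in s distinct rows."""
    Phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)
        Phi[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return Phi

m, n, s = 128, 2048, 8
Phi = sparse_jl(m, n, s)

# T: a handful of random unit vectors; measure the multiplicative distortion.
T = rng.standard_normal((n, 20))
T /= np.linalg.norm(T, axis=0)
distortion = np.abs(np.linalg.norm(Phi @ T, axis=0) ** 2 - 1).max()
print(f"max | ||Phi x||^2 - 1 | over T: {distortion:.3f}")
```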

    Recovering the Optimal Solution by Dual Random Projection

    Random projection has been widely used in data classification. It maps high-dimensional data into a low-dimensional subspace in order to reduce the computational cost of solving the related optimization problem. While previous studies focused on analyzing the classification performance of using random projection, in this work we consider the recovery problem, i.e., how to accurately recover the optimal solution to the original optimization problem in the high-dimensional space based on the solution learned from the subspace spanned by random projections. We present a simple algorithm, termed Dual Random Projection, that uses the dual solution of the low-dimensional optimization problem to recover the optimal solution to the original problem. Our theoretical analysis shows that, with high probability, the proposed algorithm is able to accurately recover the optimal solution to the original problem, provided that the data matrix is of low rank or can be well approximated by a low-rank matrix.

    Comment: The 26th Annual Conference on Learning Theory (COLT 2013).
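    A minimal Python sketch of the idea for ridge regression, where the dual solution has a closed form (the paper treats more general losses; all parameter choices here are illustrative): solve the randomly projected problem, take its dual variables, and recover the solution in the original high-dimensional space:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, r, k, lam = 200, 5000, 5, 50, 1.0

# Low-rank data matrix X (rank r), as the recovery guarantee requires.
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, d))
y = rng.standard_normal(n)

# Optimal ridge solution in the original d-dimensional space, via its dual:
# w* = X^T alpha* with (X X^T + lam I) alpha* = y.
alpha_star = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
w_star = X.T @ alpha_star

# Dual Random Projection sketch: project to k dimensions, solve the small
# problem, and keep only its dual variables ...
R = rng.standard_normal((d, k)) / np.sqrt(k)
Xp = X @ R
alpha_hat = np.linalg.solve(Xp @ Xp.T + lam * np.eye(n), y)

# ... then recover in the ORIGINAL space using the original data matrix.
w_recovered = X.T @ alpha_hat

print("relative recovery error:",
      np.linalg.norm(w_recovered - w_star) / np.linalg.norm(w_star))
```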