242 research outputs found

    Sparser Johnson-Lindenstrauss Transforms

    We give two different and simple constructions for dimensionality reduction in $\ell_2$ via linear mappings that are sparse: only an $O(\varepsilon)$-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion $1+\varepsilon$ with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas (JCSS 2003) and Dasgupta, Kumar, and Sarlós (STOC 2010). Such distributions can be used to speed up applications where $\ell_2$ dimensionality reduction is used.
    Comment: v6: journal version, minor changes, added Remark 23; v5: modified abstract, fixed typos, added open problem section; v4: simplified section 4 by giving 1 analysis that covers both constructions; v3: proof of Theorem 25 in v2 was written incorrectly, now fixed; v2: Added another construction achieving same upper bound, and added proof of near-tight lower bound for DKS scheme.
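The sparsity idea in this abstract can be illustrated with a minimal sketch. It is not the paper's exact constructions: it assumes a generic sparse sign matrix in which each column has exactly `s` non-zero entries, each equal to ±1/√s, placed in `s` distinct random rows. The names `sparse_jl_columns` and `embed`, and the parameters `d` (input dimension), `k` (number of rows), and `s` (non-zeros per column), are hypothetical and chosen only for illustration.

```python
import math
import random


def sparse_jl_columns(d, k, s, seed=0):
    """Build a k x d sparse sign matrix, stored column by column.

    Each column holds s (row, value) pairs with value = +-1/sqrt(s),
    so every column has unit Euclidean norm. This is an illustrative
    assumption, not the construction from the paper.
    """
    rng = random.Random(seed)
    cols = []
    for _ in range(d):
        rows = rng.sample(range(k), s)  # s distinct row indices
        cols.append([(r, rng.choice((-1.0, 1.0)) / math.sqrt(s)) for r in rows])
    return cols


def embed(cols, x, k):
    """Compute y = A x, touching only the non-zero entries of A."""
    y = [0.0] * k
    for xj, col in zip(x, cols):
        if xj != 0.0:
            for r, v in col:
                y[r] += v * xj
    return y


if __name__ == "__main__":
    d, k, s = 1000, 200, 10
    cols = sparse_jl_columns(d, k, s, seed=1)
    rng = random.Random(2)
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    y = embed(cols, x, k)
    ratio = math.sqrt(sum(v * v for v in y) / sum(v * v for v in x))
    print(f"norm ratio after embedding: {ratio:.3f}")
```

With this layout, embedding a vector costs time proportional to `s` times its number of non-zero coordinates rather than `k` times that number, which is the kind of speed-up the abstract refers to.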

    Improved Differentially Private Euclidean Distance Approximation

    Sketching via hashing: from heavy hitters to compressed sensing to sparse fourier transform

    Sketching via hashing is a popular and useful method for processing large data sets. Its basic idea is as follows. Suppose that we have a large multi-set of elements S=[formula], and we would like to identify the elements that occur "frequently" in S. The algorithm starts by selecting a hash function h that maps the elements into an array c[1…m]. The array entries are initialized to 0. Then, for each element a ∈ S, the algorithm increments c[h(a)]. At the end of the process, each array entry c[j] contains the count of all data elements a ∈ S mapped to j.
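The counting procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the hash function is built from salted BLAKE2b purely for convenience (the abstract does not specify a hash family), the array is 0-indexed rather than c[1…m], and the names `make_hash` and `hash_counts` are hypothetical.

```python
import hashlib


def make_hash(m, seed=0):
    """Return a function h mapping any element to a bucket in range(m).

    Salted BLAKE2b is an assumption made for this sketch; any hash
    function into m buckets fits the description.
    """
    salt = seed.to_bytes(8, "little")

    def h(a):
        digest = hashlib.blake2b(
            repr(a).encode("utf-8"), digest_size=8, salt=salt
        ).digest()
        return int.from_bytes(digest, "little") % m

    return h


def hash_counts(stream, m, seed=0):
    """Initialize c to zeros, then increment c[h(a)] for each element a."""
    h = make_hash(m, seed)
    c = [0] * m
    for a in stream:
        c[h(a)] += 1
    return c, h


if __name__ == "__main__":
    stream = ["x"] * 50 + [f"item{i}" for i in range(100)]
    c, h = hash_counts(stream, m=16)
    # c[h("x")] counts "x" plus every other element hashed to the same bucket,
    # so it can only overestimate the true frequency of "x".
    print("bucket count for 'x':", c[h("x")])
```

Because several elements can share a bucket, `c[h(a)]` is an upper bound on the number of occurrences of `a`, never an undercount.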

    Tighter Bounds on Johnson Lindenstrauss Transforms

    Johnson and Lindenstrauss (1984) proved that any finite set of data in a high dimensional space can be projected into a low dimensional space with the Euclidean metric information of the set being preserved within any desired accuracy. Such dimension reduction plays a critical role in many applications with massive data. There has been extensive effort in the literature on how to find explicit constructions of Johnson-Lindenstrauss projections. In this poster, we show how algebraic codes over finite fields can be used for fast Johnson-Lindenstrauss projections of data in high dimensional Euclidean spaces.