526 research outputs found

    Near-Neighbor Preserving Dimension Reduction for Doubling Subsets of l_1

    Randomized dimensionality reduction has been recognized as one of the fundamental techniques in handling high-dimensional data. Starting with the celebrated Johnson-Lindenstrauss Lemma, such reductions have been studied in depth for the Euclidean (l_2) metric, but much less for the Manhattan (l_1) metric. Our primary motivation is the approximate nearest neighbor problem in l_1. We exploit its reduction to the decision-with-witness version, called approximate near neighbor, which incurs a roughly logarithmic overhead. In 2007, Indyk and Naor, in the context of approximate nearest neighbors, introduced the notion of nearest neighbor-preserving embeddings. These are randomized embeddings between two metric spaces with guaranteed bounded distortion only for the distances between a query point and a point set. Such embeddings are known to exist for both l_2 and l_1 metrics, as well as for doubling subsets of l_2. The case that remained open was that of doubling subsets of l_1. In this paper, we propose a dimension reduction by means of a near neighbor-preserving embedding for doubling subsets of l_1. Our approach is to represent the point set with a carefully chosen covering set, then randomly project the latter. We study two types of covering sets, c-approximate r-nets and randomly shifted grids, and we discuss the tradeoff between them in terms of preprocessing time and target dimension. We employ Cauchy variables; certain concentration bounds derived should be of independent interest.
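The role of Cauchy variables in l_1 can be illustrated with a minimal sketch. It is not the construction of the paper above (which projects a covering set and proves its own concentration bounds); it is the standard 1-stable projection with a median estimator, shown only to make the idea concrete. The function names `cauchy_matrix`, `project`, and `estimate_l1_distance` are hypothetical. The sketch relies on the fact that a sum of i.i.d. standard Cauchy variables weighted by x is distributed as ||x||_1 times a standard Cauchy variable, whose absolute value has median 1.

```python
import math
import random
from statistics import median


def cauchy_matrix(k, d, rng):
    """Return a k x d matrix of i.i.d. standard Cauchy entries.

    Each entry is tan(pi * (U - 1/2)) with U uniform on [0, 1),
    which is the inverse-CDF sampler for the standard Cauchy law.
    """
    return [
        [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(d)]
        for _ in range(k)
    ]


def project(A, x):
    """Linear map x -> A x, reducing dimension from d to k."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def estimate_l1_distance(px, py):
    """Estimate ||x - y||_1 from the projections px = A x and py = A y.

    By 1-stability, each coordinate of A (x - y) is ||x - y||_1 times a
    standard Cauchy variable, and |Cauchy| has median 1, so the sample
    median of the absolute differences estimates the l_1 distance.
    """
    return median(abs(a - b) for a, b in zip(px, py))
```

Note that the median is not a norm, so this sketch estimates distances rather than embedding l_1 into a lower-dimensional l_1; it only shows why Cauchy variables are the natural choice for the Manhattan metric.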

    Impossibility of dimension reduction in the nuclear norm

    Let $\mathsf{S}_1$ (the Schatten–von Neumann trace class) denote the Banach space of all compact linear operators $T:\ell_2\to \ell_2$ whose nuclear norm $\|T\|_{\mathsf{S}_1}=\sum_{j=1}^\infty\sigma_j(T)$ is finite, where $\{\sigma_j(T)\}_{j=1}^\infty$ are the singular values of $T$. We prove that for arbitrarily large $n\in \mathbb{N}$ there exists a subset $\mathcal{C}\subseteq \mathsf{S}_1$ with $|\mathcal{C}|=n$ that cannot be embedded with bi-Lipschitz distortion $O(1)$ into any $n^{o(1)}$-dimensional linear subspace of $\mathsf{S}_1$. $\mathcal{C}$ is not even an $O(1)$-Lipschitz quotient of any subset of any $n^{o(1)}$-dimensional linear subspace of $\mathsf{S}_1$. Thus, $\mathsf{S}_1$ does not admit a dimension reduction result à la Johnson and Lindenstrauss (1984), which complements the work of Harrow, Montanaro and Short (2011) on the limitations of quantum dimension reduction under the assumption that the embedding into low dimensions is a quantum channel. Such a statement was previously known with $\mathsf{S}_1$ replaced by the Banach space $\ell_1$ of absolutely summable sequences via the work of Brinkman and Charikar (2003). In fact, the above set $\mathcal{C}$ can be taken to be the same set as the one that Brinkman and Charikar considered, viewed as a collection of diagonal matrices in $\mathsf{S}_1$. The challenge is to demonstrate that $\mathcal{C}$ cannot be faithfully realized in an arbitrary low-dimensional subspace of $\mathsf{S}_1$, while Brinkman and Charikar obtained such an assertion only for subspaces of $\mathsf{S}_1$ that consist of diagonal operators (i.e., subspaces of $\ell_1$). We establish this by proving that the Markov 2-convexity constant of any finite dimensional linear subspace $X$ of $\mathsf{S}_1$ is at most a universal constant multiple of $\sqrt{\log \mathrm{dim}(X)}$.
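The identification of diagonal operators in the trace class with sequences in l_1, used in the abstract above, follows directly from the definition of the nuclear norm. A short worked version, with a small numerical instance added purely as an illustration:

```latex
% Diagonal operator T = diag(x_1, x_2, ...) with x = (x_j) absolutely summable.
% Its singular values are the numbers |x_j| (arranged in nonincreasing order), hence
\|T\|_{\mathsf{S}_1}
  = \sum_{j=1}^{\infty} \sigma_j(T)
  = \sum_{j=1}^{\infty} |x_j|
  = \|x\|_{\ell_1}.
% Illustrative instance: T = diag(3, -4) has singular values 4 and 3, so
\|\mathrm{diag}(3,-4)\|_{\mathsf{S}_1} = 4 + 3 = 7 = \|(3,-4)\|_{\ell_1}.
```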

    Dimension Reduction Techniques for l_p (1<p<2), with Applications

    For Euclidean space (l_2), there exists the powerful dimension reduction transform of Johnson and Lindenstrauss [Conf. in modern analysis and probability, AMS 1984], with a host of known applications. Here, we consider the problem of dimension reduction for all l_p spaces with 1<p<2. Although strong lower bounds are known for dimension reduction in l_1, Ostrovsky and Rabani [JACM 2002] successfully circumvented these by presenting an l_1 embedding that maintains fidelity in only a bounded distance range, with applications to clustering and nearest neighbor search. However, their embedding techniques are specific to l_1 and do not naturally extend to other norms. In this paper, we apply a range of advanced techniques and produce bounded range dimension reduction embeddings for all 1<p<2, thereby demonstrating that the approach initiated by Ostrovsky and Rabani for l_1 can be extended to a much more general framework. We also obtain improved bounds in terms of the intrinsic dimensionality. As a result, we achieve improved bounds for proximity problems including snowflake embeddings and clustering.

    Metric Embedding via Shortest Path Decompositions

    We study the problem of embedding shortest-path metrics of weighted graphs into $\ell_p$ spaces. We introduce a new embedding technique based on low-depth decompositions of a graph via shortest paths. The notion of Shortest Path Decomposition depth is inductively defined: a (weighted) path graph has shortest path decomposition (SPD) depth $1$. A general graph has an SPD of depth $k$ if it contains a shortest path whose deletion leads to a graph, each of whose components has SPD depth at most $k-1$. In this paper we give an $O(k^{\min\{\frac{1}{p},\frac{1}{2}\}})$-distortion embedding for graphs of SPD depth at most $k$. This result is asymptotically tight for any fixed $p>1$, while for $p=1$ it is tight up to second order terms. As a corollary of this result, we show that graphs having pathwidth $k$ embed into $\ell_p$ with distortion $O(k^{\min\{\frac{1}{p},\frac{1}{2}\}})$. For $p=1$, this improves over the best previous bound of Lee and Sidiropoulos that was exponential in $k$; moreover, for other values of $p$ it gives the first embeddings whose distortion is independent of the graph size $n$. Furthermore, we use the fact that planar graphs have SPD depth $O(\log n)$ to give a new proof that any planar graph embeds into $\ell_1$ with distortion $O(\sqrt{\log n})$. Our approach also gives new results for graphs with bounded treewidth, and for graphs excluding a fixed minor.
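The inductive definition of SPD depth above can be made concrete with a small sketch. The code below is an assumption-laden illustration, not an algorithm from the paper: it handles unweighted graphs only (so that breadth-first search yields shortest paths), and it picks the path to delete greedily by a double BFS, so the value returned is only an upper bound on the SPD depth. The names `bfs` and `greedy_spd_depth` are hypothetical.

```python
from collections import deque


def bfs(adj, src, alive):
    """BFS from src inside the vertex set `alive`.

    Returns (parent, order): the BFS tree as a parent map, and the
    vertices in visiting order (nondecreasing distance from src).
    """
    parent = {src: None}
    order = [src]
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in alive and w not in parent:
                parent[w] = u
                order.append(w)
                queue.append(w)
    return parent, order


def greedy_spd_depth(adj, alive=None):
    """Upper bound on the SPD depth of an unweighted graph.

    adj maps each vertex to a list of neighbours. For every connected
    component, one shortest path (found by a double BFS) is deleted and
    the procedure recurses on what remains; the depth is one more than
    the largest depth among the remaining pieces. The empty graph has
    depth 0, so a path graph gets depth 1, matching the definition.
    """
    if alive is None:
        alive = set(adj)
    depth = 0
    remaining = set(alive)
    while remaining:
        start = next(iter(remaining))
        _, order = bfs(adj, start, alive)
        component = set(order)
        remaining -= component
        far = order[-1]  # a vertex farthest from start
        parent, order2 = bfs(adj, far, component)
        v = order2[-1]  # a vertex farthest from far
        path = set()
        while v is not None:  # walk the BFS tree back to far
            path.add(v)
            v = parent[v]
        depth = max(depth, 1 + greedy_spd_depth(adj, component - path))
    return depth
```

For example, a star with three leaves gets depth 2 under this procedure: deleting a leaf-center-leaf path leaves a single vertex, which is itself a path graph.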