16 research outputs found

    Approximation and Streaming Algorithms for Projective Clustering via Random Projections

    Full text link
    Let PP be a set of nn points in Rd\mathbb{R}^d. In the projective clustering problem, given k,qk, q and norm ρ[1,]\rho \in [1,\infty], we have to compute a set F\mathcal{F} of kk qq-dimensional flats such that (pPd(p,F)ρ)1/ρ(\sum_{p\in P}d(p, \mathcal{F})^\rho)^{1/\rho} is minimized; here d(p,F)d(p, \mathcal{F}) represents the (Euclidean) distance of pp to the closest flat in F\mathcal{F}. We let fkq(P,ρ)f_k^q(P,\rho) denote the minimal value and interpret fkq(P,)f_k^q(P,\infty) to be maxrPd(r,F)\max_{r\in P}d(r, \mathcal{F}). When ρ=1,2\rho=1,2 and \infty and q=0q=0, the problem corresponds to the kk-median, kk-mean and the kk-center clustering problems respectively. For every 0<ϵ<10 < \epsilon < 1, SPS\subset P and ρ1\rho \ge 1, we show that the orthogonal projection of PP onto a randomly chosen flat of dimension O(((q+1)2log(1/ϵ)/ϵ3)logn)O(((q+1)^2\log(1/\epsilon)/\epsilon^3) \log n) will ϵ\epsilon-approximate f1q(S,ρ)f_1^q(S,\rho). This result combines the concepts of geometric coresets and subspace embeddings based on the Johnson-Lindenstrauss Lemma. As a consequence, an orthogonal projection of PP to an O(((q+1)2log((q+1)/ϵ)/ϵ3)logn)O(((q+1)^2 \log ((q+1)/\epsilon)/\epsilon^3) \log n) dimensional randomly chosen subspace ϵ\epsilon-approximates projective clusterings for every kk and ρ\rho simultaneously. Note that the dimension of this subspace is independent of the number of clusters~kk. Using this dimension reduction result, we obtain new approximation and streaming algorithms for projective clustering problems. For example, given a stream of nn points, we show how to compute an ϵ\epsilon-approximate projective clustering for every kk and ρ\rho simultaneously using only O((n+d)((q+1)2log((q+1)/ϵ))/ϵ3logn)O((n+d)((q+1)^2\log ((q+1)/\epsilon))/\epsilon^3 \log n) space. Compared to standard streaming algorithms with Ω(kd)\Omega(kd) space requirement, our approach is a significant improvement when the number of input points and their dimensions are of the same order of magnitude.Comment: Canadian Conference on Computational Geometry (CCCG 2015

    A Robust and Optimal Online Algorithm for Minimum Metric Bipartite Matching

    Get PDF
    We study the Online Minimum Metric Bipartite Matching Problem. In this problem, we are given point sets S and R which correspond to the server and request locations; here |S|=|R|=n. All these locations are points from some metric space and the cost of matching a server to a request is given by the distance between their locations in this space. In this problem, the request points arrive one at a time. When a request arrives, we must immediately and irrevocably match it to a "free" server. The matching obtained after all the requests are processed is the online matching M. The cost of M is the sum of the cost of its edges. The performance of any online algorithm is the worst-case ratio of the cost of its online solution M to the minimum-cost matching. We present a deterministic online algorithm for this problem. Our algorithm is the first to simultaneously achieve optimal performances in the well-known adversarial and the random arrival models. For the adversarial model, we obtain a competitive ratio of 2n-1 + o(1); it is known that no deterministic algorithm can do better than 2n-1. In the random arrival model, our algorithm obtains a competitive ratio of 2H_n - 1 + o(1); where H_n is the n-th Harmonic number. We also prove that any online algorithm will have a competitive ratio of at least 2H_n - 1-o(1) in this model. We use a new variation of the offline primal-dual method for computing minimum cost matching to compute the online matching. Our primal-dual method is based on a relaxed linear-program. Under metric costs, this specific relaxation helps us relate the cost of the online matching with the offline matching leading to its robust properties

    Optimal Analysis of an Online Algorithm for the Bipartite Matching Problem on a Line

    Get PDF
    In the online metric bipartite matching problem, we are given a set S of server locations in a metric space. Requests arrive one at a time, and on its arrival, we need to immediately and irrevocably match it to a server at a cost which is equal to the distance between these locations. A alpha-competitive algorithm will assign requests to servers so that the total cost is at most alpha times the cost of M_{Opt} where M_{Opt} is the minimum cost matching between S and R. We consider this problem in the adversarial model for the case where S and R are points on a line and |S|=|R|=n. We improve the analysis of the deterministic Robust Matching Algorithm (RM-Algorithm, Nayyar and Raghvendra FOCS\u2717) from O(log^2 n) to an optimal Theta(log n). Previously, only a randomized algorithm under a weaker oblivious adversary achieved a competitive ratio of O(log n) (Gupta and Lewi, ICALP\u2712). The well-known Work Function Algorithm (WFA) has a competitive ratio of O(n) and Omega(log n) for this problem. Therefore, WFA cannot achieve an asymptotically better competitive ratio than the RM-Algorithm

    Polynomial-Sized Topological Approximations Using The Permutahedron

    Get PDF
    Classical methods to model topological properties of point clouds, such as the Vietoris-Rips complex, suffer from the combinatorial explosion of complex sizes. We propose a novel technique to approximate a multi-scale filtration of the Rips complex with improved bounds for size: precisely, for nn points in Rd\mathbb{R}^d, we obtain a O(d)O(d)-approximation with at most n2O(dlogk)n2^{O(d \log k)} simplices of dimension kk or lower. In conjunction with dimension reduction techniques, our approach yields a O(polylog(n))O(\mathrm{polylog} (n))-approximation of size nO(1)n^{O(1)} for Rips filtrations on arbitrary metric spaces. This result stems from high-dimensional lattice geometry and exploits properties of the permutahedral lattice, a well-studied structure in discrete geometry. Building on the same geometric concept, we also present a lower bound result on the size of an approximate filtration: we construct a point set for which every (1+ϵ)(1+\epsilon)-approximation of the \v{C}ech filtration has to contain nΩ(loglogn)n^{\Omega(\log\log n)} features, provided that ϵ<1log1+cn\epsilon <\frac{1}{\log^{1+c} n} for c(0,1)c\in(0,1).Comment: 24 pages, 1 figur

    A Weighted Approach to the Maximum Cardinality Bipartite Matching Problem with Applications in Geometric Settings

    Get PDF
    We present a weighted approach to compute a maximum cardinality matching in an arbitrary bipartite graph. Our main result is a new algorithm that takes as input a weighted bipartite graph G(A cup B,E) with edge weights of 0 or 1. Let w <= n be an upper bound on the weight of any matching in G. Consider the subgraph induced by all the edges of G with a weight 0. Suppose every connected component in this subgraph has O(r) vertices and O(mr/n) edges. We present an algorithm to compute a maximum cardinality matching in G in O~(m(sqrt{w} + sqrt{r} + wr/n)) time. When all the edge weights are 1 (symmetrically when all weights are 0), our algorithm will be identical to the well-known Hopcroft-Karp (HK) algorithm, which runs in O(m sqrt{n}) time. However, if we can carefully assign weights of 0 and 1 on its edges such that both w and r are sub-linear in n and wr=O(n^{gamma}) for gamma < 3/2, then we can compute maximum cardinality matching in G in o(m sqrt{n}) time. Using our algorithm, we obtain a new O~(n^{4/3}/epsilon^4) time algorithm to compute an epsilon-approximate bottleneck matching of A,B subsetR^2 and an 1/(epsilon^{O(d)}}n^{1+(d-1)/(2d-1)}) poly log n time algorithm for computing epsilon-approximate bottleneck matching in d-dimensions. All previous algorithms take Omega(n^{3/2}) time. Given any graph G(A cup B,E) that has an easily computable balanced vertex separator for every subgraph G\u27(V\u27,E\u27) of size |V\u27|^{delta}, for delta in [1/2,1), we can apply our algorithm to compute a maximum matching in O~(mn^{delta/1+delta}) time improving upon the O(m sqrt{n}) time taken by the HK-Algorithm

    Improved Approximate Rips Filtrations with Shifted Integer Lattices

    Get PDF
    Rips complexes are important structures for analyzing topological features of metric spaces. Unfortunately, generating these complexes constitutes an expensive task because of a combinatorial explosion in the complex size. For n points in R^d, we present a scheme to construct a 4.24-approximation of the multi-scale filtration of the Rips complex in the L-infinity metric, which extends to a O(d^{0.25})-approximation of the Rips filtration for the Euclidean case. The k-skeleton of the resulting approximation has a total size of n2^{O(d log k)}. The scheme is based on the integer lattice and on the barycentric subdivision of the d-cube
    corecore