    Clustering affine subspaces: hardness and algorithms

    We study a generalization of the famous k-center problem where each object is an affine subspace of dimension Δ, and give either the first or significantly improved algorithms and hardness results for many combinations of parameters. This generalization from points (Δ = 0) is motivated by the analysis of incomplete data, a pervasive challenge in statistics: incomplete data objects in ℝ^d can be modeled as affine subspaces. We give three algorithmic results for different values of k, under the assumption that all subspaces are axis-parallel, the main case of interest because of the correspondence to missing entries in data tables. (1) k = 1: Two polynomial-time approximation schemes, each running in time poly(Δ, 1/ε) · nd. (2) k = 2: An O(Δ^{1/4})-approximation algorithm running in time poly(n, d, Δ). (3) General k: A polynomial-time approximation scheme running in time 2^{O(Δk log k (1 + 1/ε²))} · nd. We also prove nearly matching hardness results: in both the general (not necessarily axis-parallel) case (for k ≥ 2) and the axis-parallel case (for k ≥ 3), the running time of an approximation algorithm with any approximation ratio cannot be polynomial in even one of k and Δ, unless P = NP. Furthermore, assuming that 3-SAT cannot be solved sub-exponentially, the dependence on both k and Δ must be exponential in the general case (in the axis-parallel case, only the dependence on k drops, to 2^{Ω(√k)}). The simplicity of the first and third algorithms suggests that they might actually be used in statistical applications. The second algorithm, which demonstrates a theoretical gap between the axis-parallel and general cases for k = 2, displays a strong connection between geometric clustering and classical coloring problems on graphs and hypergraphs, via a new Helly-type theorem.
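
    To make the incomplete-data view concrete, here is a minimal Python sketch of the model only, not of the paper's approximation schemes; the function names and the brute-force objective evaluation are our own illustration. An incomplete point in ℝ^d is stored as its observed values plus a mask; the induced object is the axis-parallel affine subspace that is free in the missing coordinates, and the k = 1 objective is the largest Euclidean distance from a candidate center to any of these subspaces.

```python
import numpy as np

def dist_to_axis_parallel_flat(center, values, observed):
    # Distance from `center` to the axis-parallel affine subspace
    # {x : x[i] = values[i] for each observed coordinate i}.
    # Missing coordinates are free, so only observed ones contribute.
    diff = center[observed] - values[observed]
    return float(np.sqrt(np.sum(diff ** 2)))

def one_center_radius(center, incomplete_points):
    # k = 1 objective: the radius needed to cover every subspace
    # from `center`; `incomplete_points` holds (values, mask) pairs.
    return max(dist_to_axis_parallel_flat(center, v, m)
               for v, m in incomplete_points)

# A point in R^3 with a missing second entry becomes a line (Δ = 1):
values = np.array([1.0, 0.0, 2.0])
observed = np.array([True, False, True])
print(one_center_radius(np.zeros(3), [(values, observed)]))  # sqrt(5)
```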

    Geometry Helps to Compare Persistence Diagrams

    Exploiting geometric structure to improve the asymptotic complexity of discrete assignment problems is a well-studied subject. In contrast, the practical advantages of using geometry for such problems have not been explored. We implement geometric variants of the Hopcroft–Karp algorithm for bottleneck matching (based on previous work by Efrat et al.) and of the auction algorithm by Bertsekas for Wasserstein distance computation. Both implementations use k-d trees to replace a linear scan with a geometric proximity query. Our interest in this problem stems from the desire to compute distances between persistence diagrams, a problem that comes up frequently in topological data analysis. We show that our geometric matching algorithms lead to a substantial performance gain, both in running time and in memory consumption, over their purely combinatorial counterparts. Moreover, our implementation significantly outperforms the only other implementation available for comparing persistence diagrams. Comment: 20 pages, 10 figures; extended version of a paper published in ALENEX 2016.
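
    The implementation trick at the heart of the paper is easy to illustrate. The greedy pass below is our own toy sketch, not the paper's Hopcroft–Karp or auction variants (it also ignores matching points to the diagonal); it only shows how a k-d tree proximity query replaces a linear scan over all candidate partners, here via SciPy's cKDTree.

```python
import numpy as np
from scipy.spatial import cKDTree

def greedy_match_cost(diagram_a, diagram_b, k=8):
    # Index one diagram with a k-d tree so that each point of the
    # other queries a few nearby candidates instead of scanning all.
    tree = cKDTree(diagram_b)
    used, total = set(), 0.0
    for p in diagram_a:
        dists, idxs = tree.query(p, k=min(k, len(diagram_b)))
        for d, j in zip(np.atleast_1d(dists), np.atleast_1d(idxs)):
            if int(j) not in used:   # take the nearest unused candidate
                used.add(int(j))
                total += float(d)
                break
        # Points whose k nearest candidates are all taken are skipped
        # in this toy; a real matcher handles them exactly.
    return total
```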

    Optimized normal and distance matching for heterogeneous object modeling

    This paper presents a new optimization methodology of material blending for heterogeneous object modeling, which matches the material governing features when designing a heterogeneous object. The proposed method establishes a point-to-point correspondence, represented by a set of connecting lines, between two material directrices. To blend the material features between the directrices, a heuristic optimization method is developed whose objective is to maximize the sum of the inner products of the unit normals at the end points of the connecting lines while minimizing the sum of the lengths of the connecting lines. The geometric features with material information are matched so as to generate non-self-intersecting and non-twisted connecting surfaces. By subdividing the connecting lines into an equal number of segments, a series of intermediate piecewise curves is generated to represent the material metamorphosis between the governing material features. Alternatively, a dynamic programming approach developed in our earlier work is presented for comparison purposes. The results and computational efficiency of the proposed heuristic method are also compared with earlier techniques in the literature. Computer interface implementation and illustrative examples are also presented in this paper.
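
    As a rough, hypothetical illustration of the stated objective (our sketch, not the paper's heuristic or its dynamic-programming alternative): if both directrices are sampled with the same number of points carrying unit normals, each cyclic shift of the point-to-point correspondence can be scored by the sum of normal inner products minus a weighted sum of connecting-line lengths, keeping the best shift.

```python
import numpy as np

def best_cyclic_correspondence(p1, n1, p2, n2, w=1.0):
    # p1, p2: (m, 3) sample points on the two directrices;
    # n1, n2: (m, 3) unit normals at those points.
    # Score = (sum of normal inner products) - w * (sum of line lengths).
    m = len(p1)
    best_shift, best_score = 0, -np.inf
    for s in range(m):
        idx = (np.arange(m) + s) % m         # cyclic shift of the pairing
        align = float(np.sum(n1 * n2[idx]))  # sum of normal inner products
        length = float(np.linalg.norm(p1 - p2[idx], axis=1).sum())
        score = align - w * length
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift, best_score
```

    The intermediate curves of the metamorphosis then come from subdividing each connecting line p1[i] to p2[idx[i]] into an equal number of segments, as the abstract describes.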

    Solving a "Hard" Problem to Approximate an "Easy" One: Heuristics for Maximum Matchings and Maximum Traveling Salesman Problems

    We consider geometric instances of the Maximum Weighted Matching Problem (MWMP) and the Maximum Traveling Salesman Problem (MTSP) with up to 3,000,000 vertices. Making use of a geometric duality relationship between the MWMP, the MTSP, and the Fermat-Weber Problem (FWP), we develop a heuristic approach that yields, in near-linear time, both solutions and upper bounds. Using various computational tools, we get solutions within considerably less than 1% of the optimum. An interesting feature of our approach is that, even though the FWP is hard to compute in theory and Edmonds' algorithm yields a polynomial-time solution for the MWMP, the practical behavior is just the opposite: we can solve the FWP with high accuracy and use it to find a good heuristic solution for the MWMP. Comment: 20 pages, 14 figures, LaTeX; to appear in the Journal of Experimental Algorithmics, 2002.
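
    The "hard in theory, easy in practice" side is simple to demonstrate: the classic Weiszfeld iteration below approximates the Fermat-Weber point of a point set to high accuracy in a few iterations. This is a textbook method shown only to make the FWP side of the duality concrete; it is not the authors' full heuristic pipeline.

```python
import numpy as np

def fermat_weber(points, iters=1000, tol=1e-12):
    # Weiszfeld iteration: minimize the sum of Euclidean distances
    # from x to the given points (the Fermat-Weber point).
    x = points.mean(axis=0)                    # start at the centroid
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(points - x, axis=1), tol)
        w = 1.0 / d                            # inverse-distance weights
        x_new = (points * w[:, None]).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]])
print(fermat_weber(pts))  # geometric median of the three vertices
```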

    Tree-based Coarsening and Partitioning of Complex Networks

    Many applications produce massive complex networks whose analysis would benefit from parallel processing. Parallel algorithms, in turn, often require a suitable network partition. For solving optimization tasks such as graph partitioning on large networks, multilevel methods are preferred in practice. Yet, complex networks pose challenges to established multilevel algorithms, in particular to their coarsening phase. One way to specify a (recursive) coarsening of a graph is to rate its edges and then contract the edges as prioritized by the rating. In this paper we (i) define weights for the edges of a network that express the edges' importance for connectivity, (ii) compute a minimum weight spanning tree T^m with respect to these weights, and (iii) rate the network edges based on the conductance values of T^m's fundamental cuts. To this end, we also (iv) develop the first optimal linear-time algorithm to compute the conductance values of all fundamental cuts of a given spanning tree. We integrate the new edge rating into a leading multilevel graph partitioner and equip the latter with a new greedy postprocessing for optimizing the maximum communication volume (MCV). Experiments on bipartitioning frequently used benchmark networks show that the postprocessing already reduces MCV by 11.3%. Our new edge rating further reduces MCV by 10.3% compared to the previously best rating, with the postprocessing in place for both ratings. In total, with a modest increase in running time, our new approach reduces the MCV of complex network partitions by 20.4%.
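
    For intuition about step (iv), here is a simple quadratic-time reference implementation, our own sketch rather than the paper's optimal linear-time algorithm: deleting a spanning-tree edge splits the vertices into two sides, the fundamental cut is the set of graph edges crossing that split, and its conductance is the cut size divided by the smaller side's volume (sum of degrees).

```python
def fundamental_cut_conductances(adj, tree_edges):
    # adj: dict node -> set of neighbours (simple undirected graph);
    # tree_edges: list of edges (u, v) forming a spanning tree of it.
    tree_adj = {u: set() for u in adj}
    for a, b in tree_edges:
        tree_adj[a].add(b)
        tree_adj[b].add(a)
    vol_total = sum(len(ns) for ns in adj.values())

    conductances = {}
    for u, v in tree_edges:
        # Collect u's side of the tree once edge (u, v) is deleted.
        side, stack = {u}, [u]
        while stack:
            x = stack.pop()
            for y in tree_adj[x]:
                if (x, y) == (u, v) or y in side:
                    continue
                side.add(y)
                stack.append(y)
        cut = sum(1 for x in side for y in adj[x] if y not in side)
        vol_side = sum(len(adj[x]) for x in side)
        conductances[(u, v)] = cut / min(vol_side, vol_total - vol_side)
    return conductances

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(fundamental_cut_conductances(adj, [(0, 1), (1, 2), (2, 3)]))
```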