
    Asymptotic of geometrical navigation on a random set of points of the plane

    A navigation on a set of points $S$ is a rule for choosing which point to move to from the present point in order to progress toward a specified target. We study some navigations in the plane where $S$ is a non-uniform Poisson point process (in a finite domain) with intensity going to $+\infty$. We show the convergence of the traveller path lengths, the number of stages done, and the geometry of the traveller trajectories, uniformly for all starting points and targets, for several navigations of geometric nature. Other costs are also considered. This leads to asymptotic results on the stretch factors of random Yao-graphs and random $\theta$-graphs.

    On the connectivity of visibility graphs

    The visibility graph of a finite set of points in the plane has the points as vertices and an edge between two vertices if the line segment between them contains no other points. This paper establishes bounds on the edge- and vertex-connectivity of visibility graphs. Unless all its vertices are collinear, a visibility graph has diameter at most 2, and so it follows by a result of Plesník (1975) that its edge-connectivity equals its minimum degree. We strengthen the result of Plesník by showing that for any two vertices $v$ and $w$ in a graph of diameter 2, if $\deg(v) \le \deg(w)$ then there exist $\deg(v)$ edge-disjoint $vw$-paths of length at most 4. Furthermore, we find that in visibility graphs every minimum edge cut is the set of edges incident to a vertex of minimum degree. For vertex-connectivity, we prove that every visibility graph with $n$ vertices and at most $l$ collinear vertices has connectivity at least $(n-1)/(l-1)$, which is tight. We also prove the qualitatively stronger result that the vertex-connectivity is at least half the minimum degree. Finally, in the case that $l=4$ we improve this bound to two thirds of the minimum degree. Comment: 16 pages, 8 figures
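The definition of a visibility graph given in this abstract can be illustrated with a short sketch. It is not taken from the paper: the function names `strictly_between` and `visibility_graph` are hypothetical, and the points are assumed to have integer coordinates so that the collinearity test is exact.

```python
from itertools import combinations


def strictly_between(p, q, r):
    """Return True if point r lies on segment pq, strictly between p and q."""
    # Cross product is zero exactly when p, q, r are collinear.
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    if cross != 0:
        return False
    # Projection of r onto pq must fall strictly inside the segment.
    dot = (r[0] - p[0]) * (q[0] - p[0]) + (r[1] - p[1]) * (q[1] - p[1])
    length_sq = (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
    return 0 < dot < length_sq


def visibility_graph(points):
    """Adjacency sets: p and q are adjacent if no other point blocks segment pq."""
    adj = {p: set() for p in points}
    for p, q in combinations(points, 2):
        blocked = any(
            strictly_between(p, q, r) for r in points if r != p and r != q
        )
        if not blocked:
            adj[p].add(q)
            adj[q].add(p)
    return adj


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (0, 1)]
    graph = visibility_graph(pts)
    for p in pts:
        print(p, sorted(graph[p]))
```

In this small example, three of the four points are collinear, so (0, 0) and (2, 0) do not see each other, yet they share the neighbour (1, 0), which is consistent with the diameter-2 property stated in the abstract.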

    Objective-Based Hierarchical Clustering of Deep Embedding Vectors

    We initiate a comprehensive experimental study of objective-based hierarchical clustering methods on massive datasets consisting of deep embedding vectors from computer vision and NLP applications. This includes a large variety of image embedding (ImageNet, ImageNetV2, NaBirds), word embedding (Twitter, Wikipedia), and sentence embedding (SST-2) vectors from several popular recent models (e.g. ResNet, ResNext, Inception V3, SBERT). Our study includes datasets with up to $4.5$ million entries with embedding dimensions up to $2048$. In order to address the challenge of scaling up hierarchical clustering to such large datasets we propose a new practical hierarchical clustering algorithm B++&C. It gives a 5%/20% improvement on average for the popular Moseley-Wang (MW) / Cohen-Addad et al. (CKMM) objectives (normalized) compared to a wide range of classic methods and recent heuristics. We also introduce a theoretical algorithm B2SAT&C which achieves a $0.74$-approximation for the CKMM objective in polynomial time. This is the first substantial improvement over the trivial $2/3$-approximation achieved by a random binary tree. Prior to this work, the best poly-time approximation of $\approx 2/3 + 0.0004$ was due to Charikar et al. (SODA'19).

    Nonparametric Feature Extraction from Dendrograms

    We propose feature extraction from dendrograms in a nonparametric way. The Minimax distance measures correspond to building a dendrogram with the single linkage criterion and defining specific forms of a level function and a distance function over it. We therefore extend this method to arbitrary dendrograms. We develop a generalized framework wherein different distance measures can be inferred from different types of dendrograms, level functions and distance functions. Via an appropriate embedding, we compute a vector-based representation of the inferred distances, in order to enable many numerical machine learning algorithms to employ such distances. Then, to address the model selection problem, we study the aggregation of different dendrogram-based distances, respectively in solution space and in representation space, in the spirit of deep representations. In the first approach, for example for the clustering problem, we build a graph with positive and negative edge weights according to the consistency of the clustering labels of different objects among different solutions, in the context of ensemble methods. Then, we use an efficient variant of correlation clustering to produce the final clusters. In the second approach, we investigate the sequential combination of different distances and features, in the spirit of multi-layered architectures, to obtain the final features. Finally, we demonstrate the effectiveness of our approach via several numerical studies.
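The link between Minimax distances and single linkage mentioned in this abstract can be made concrete with a minimal sketch. It is an illustration, not the authors' code: the function name `minimax_distances` is hypothetical, and the input is assumed to be a symmetric matrix of pairwise base distances. The sketch relies on the standard fact that the Minimax distance between two objects (the smallest achievable value of the largest edge on a path between them) equals the height at which single linkage first merges their clusters.

```python
def minimax_distances(dist):
    """Minimax distances from a symmetric base-distance matrix.

    Edges are processed in increasing order, as in single linkage. When two
    clusters merge at height w, every cross-cluster pair gets distance w.
    """
    n = len(dist)
    root = list(range(n))              # cluster label of each object
    members = [{i} for i in range(n)]  # objects held by each cluster label
    edges = sorted(
        (dist[i][j], i, j) for i in range(n) for j in range(i + 1, n)
    )
    mm = [[0.0] * n for _ in range(n)]
    for w, i, j in edges:
        ri, rj = root[i], root[j]
        if ri == rj:
            continue  # already in the same cluster, nothing to merge
        for a in members[ri]:
            for b in members[rj]:
                mm[a][b] = mm[b][a] = w
        # Merge the smaller cluster into the larger one.
        if len(members[ri]) < len(members[rj]):
            ri, rj = rj, ri
        for b in members[rj]:
            root[b] = ri
        members[ri] |= members[rj]
        members[rj] = set()
    return mm


if __name__ == "__main__":
    xs = [0, 1, 2, 10]  # points on a line, base distance = absolute difference
    base = [[abs(a - b) for b in xs] for a in xs]
    for row in minimax_distances(base):
        print(row)
```

For the points 0, 1, 2 on a line, the base distance between 0 and 2 is 2, but the Minimax distance is 1 because the path through 1 uses only edges of length 1.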

    Comparative genomics: multiple genome rearrangement and efficient algorithm development

    Multiple genome rearrangement by signed reversal is discussed: For a collection of genomes represented by signed permutations, reconstruct their evolutionary history by using signed reversals, i.e. find a bifurcating tree where sampled genomes are assigned to leaf nodes and ancestral genomes (i.e. signed permutations) are hypothesized at internal nodes such that the total reversal distance summed over all edges of the tree is minimized. It is equivalent to finding an optimal Steiner tree that connects the given genomes by signed reversal paths. The key for the problem is to reconstruct all optimal Steiner nodes/ancestral genomes. The problem is NP-hard and can only be solved by efficient approximation algorithms. Various algorithms/programs have been designed to solve the problem, such as BPAnalysis, GRAPPA, grid search algorithm, MGR greedy split algorithm (Chapter 1). However, they may have expensive computational costs or low inference accuracy. In this thesis, several new algorithms are developed, including nearest path search algorithm (Chapter 2), neighbor-perturbing algorithm (Chapter 3), branch-and-bound algorithm (Chapter 3), perturbing-improving algorithm (Chapter 4), partitioning algorithm (Chapter 5), etc. With theoretical proofs, computer simulations, and biological applications, these algorithms are shown to be 2-approximation algorithms and more efficient than the existing algorithms.
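To make the basic operation in this abstract concrete, the sketch below applies a signed reversal to a signed permutation and computes the reversal distance between two tiny genomes by brute-force breadth-first search. It is only an illustration under stated assumptions: the names `signed_reversal` and `reversal_distance` are hypothetical, the search is exponential and usable only for very short permutations, and it is not one of the thesis's algorithms.

```python
from collections import deque


def signed_reversal(perm, i, j):
    """Reverse the segment perm[i..j] (inclusive) and flip the sign of each gene."""
    segment = [-g for g in reversed(perm[i:j + 1])]
    return perm[:i] + tuple(segment) + perm[j + 1:]


def reversal_distance(source, target):
    """Minimum number of signed reversals turning source into target (BFS)."""
    source, target = tuple(source), tuple(target)
    if source == target:
        return 0
    n = len(source)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        perm, steps = queue.popleft()
        for i in range(n):
            for j in range(i, n):
                nxt = signed_reversal(perm, i, j)
                if nxt == target:
                    return steps + 1
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + 1))
    return None  # unreachable when both are signed permutations of the same genes


if __name__ == "__main__":
    genome = (1, -3, -2, 4)
    print(signed_reversal(genome, 1, 2))
    print(reversal_distance(genome, (1, 2, 3, 4)))
```

The reversal distance computed this way is the edge cost that the tree reconstruction problem sums over all edges of the bifurcating tree.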

    Light Euclidean Spanners with Steiner Points

    The FOCS'19 paper of Le and Solomon, culminating a long line of research on Euclidean spanners, proves that the lightness (normalized weight) of the greedy $(1+\epsilon)$-spanner in $\mathbb{R}^d$ is $\tilde{O}(\epsilon^{-d})$ for any $d = O(1)$ and any $\epsilon = \Omega(n^{-\frac{1}{d-1}})$ (where $\tilde{O}$ hides polylogarithmic factors of $\frac{1}{\epsilon}$), and also shows the existence of point sets in $\mathbb{R}^d$ for which any $(1+\epsilon)$-spanner must have lightness $\Omega(\epsilon^{-d})$. Given this tight bound on the lightness, a naturally arising question is whether a better lightness bound can be achieved using Steiner points. Our first result is a construction of Steiner spanners in $\mathbb{R}^2$ with lightness $O(\epsilon^{-1} \log \Delta)$, where $\Delta$ is the spread of the point set. In the regime of $\Delta \ll 2^{1/\epsilon}$, this provides an improvement over the lightness bound of Le and Solomon [FOCS 2019]; this regime of parameters is of practical interest, as point sets arising in real-life applications (e.g., for various random distributions) have polynomially bounded spread, while in spanner applications $\epsilon$ often controls the precision, and it sometimes needs to be much smaller than $O(1/\log n)$. Moreover, for spread polynomially bounded in $1/\epsilon$, this upper bound provides a quadratic improvement over the non-Steiner bound of Le and Solomon [FOCS 2019]. We then demonstrate that such a light spanner can be constructed in $O_{\epsilon}(n)$ time for polynomially bounded spread, where $O_{\epsilon}$ hides a factor of $\mathrm{poly}(\frac{1}{\epsilon})$. Finally, we extend the construction to higher dimensions, proving a lightness upper bound of $\tilde{O}(\epsilon^{-(d+1)/2} + \epsilon^{-2}\log \Delta)$ for any $3 \leq d = O(1)$ and any $\epsilon = \Omega(n^{-\frac{1}{d-1}})$. Comment: 23 pages, 2 figures, to appear in ESA 2
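The two notions this abstract builds on, the greedy $(1+\epsilon)$-spanner and lightness, can be sketched as follows. This is the textbook non-Steiner greedy construction, not the Steiner construction of the paper, and the function names (`shortest_path`, `greedy_spanner`, `mst_weight`, `lightness`) are hypothetical. Lightness is taken here as total spanner weight divided by the weight of a minimum spanning tree; the quadratic-time pair loop is meant for small inputs only.

```python
import heapq
import math
from itertools import combinations


def shortest_path(adj, src, dst):
    """Dijkstra distance from src to dst in the current spanner (inf if none)."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def greedy_spanner(points, eps):
    """Scan pairs by increasing length; keep a pair only if it is not yet
    approximated within a factor (1 + eps) by the edges chosen so far."""
    n = len(points)
    adj = {i: [] for i in range(n)}
    edges = []
    pairs = sorted(
        combinations(range(n), 2),
        key=lambda e: math.dist(points[e[0]], points[e[1]]),
    )
    for i, j in pairs:
        d = math.dist(points[i], points[j])
        if shortest_path(adj, i, j) > (1 + eps) * d:
            adj[i].append((j, d))
            adj[j].append((i, d))
            edges.append((i, j, d))
    return edges


def mst_weight(points):
    """Weight of a Euclidean minimum spanning tree (simple O(n^2) Prim)."""
    n = len(points)
    best = [math.inf] * n
    best[0] = 0.0
    used = [False] * n
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not used[i]), key=lambda i: best[i])
        used[u] = True
        total += best[u]
        for v in range(n):
            if not used[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return total


def lightness(points, edges):
    """Spanner weight normalized by the MST weight."""
    return sum(d for _, _, d in edges) / mst_weight(points)


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
    spanner = greedy_spanner(pts, eps=0.5)
    print(len(spanner), "edges, lightness", round(lightness(pts, spanner), 3))
```

Smaller values of `eps` force the greedy rule to keep more edges, which is the trade-off between precision and lightness that the bounds in the abstract quantify.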

    NN-Steiner: A Mixed Neural-algorithmic Approach for the Rectilinear Steiner Minimum Tree Problem

    Recent years have witnessed rapid advances in the use of neural networks to solve combinatorial optimization problems. Nevertheless, designing the "right" neural model that can effectively handle a given optimization problem can be challenging, and often there is no theoretical understanding or justification of the resulting neural model. In this paper, we focus on the rectilinear Steiner minimum tree (RSMT) problem, which is of critical importance in IC layout design and as a result has attracted numerous heuristic approaches in the VLSI literature. Our contributions are two-fold. On the methodology front, we propose NN-Steiner, which is a novel mixed neural-algorithmic framework for computing RSMTs that leverages the celebrated PTAS algorithmic framework of Arora to solve this problem (and other geometric optimization problems). Our NN-Steiner replaces key algorithmic components within Arora's PTAS by suitable neural components. In particular, NN-Steiner only needs four neural network (NN) components that are called repeatedly within an algorithmic framework. Crucially, each of the four NN components is only of bounded size independent of input size, and thus easy to train. Furthermore, as the NN component is learning a generic algorithmic step, once learned, the resulting mixed neural-algorithmic framework generalizes to much larger instances not seen in training. Our NN-Steiner, to the best of our knowledge, is the first neural architecture of bounded size that has the capacity to approximately solve RSMT (and variants). On the empirical front, we show how NN-Steiner can be implemented and demonstrate the effectiveness of our resulting approach, especially in terms of generalization, by comparing with state-of-the-art methods (both neural and non-neural based). Comment: This paper is the complete version with appendix of the paper accepted in AAAI'24 with the same title