    Methods for fast and reliable clustering

    Noise-Stable Rigid Graphs for Euclidean Embedding

    We proposed a new criterion, \textit{noise-stability}, which revises classical rigidity theory, for evaluating MDS algorithms; the criterion faithfully reflects the fidelity of global structure reconstruction. We then proved the noise-stability of the cMDS algorithm under generic conditions, which provides a rigorous theoretical guarantee of precision, together with theoretical bounds, for Euclidean embedding and its applications in fields including wireless sensor network localization and satellite positioning. Furthermore, we examined previous work on minimum-cost globally rigid spanning subgraphs and proposed an algorithm to construct a minimum-cost noise-stable spanning graph in Euclidean space, which enables reliable localization on sparse graphs of noisy distance constraints with a linear number of edges and sublinear cost in total edge length. Additionally, this algorithm suggests a scheme to reconstruct point clouds from pairwise distances with a time complexity as low as $O(n)$, down from $O(n^3)$ for cMDS.
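
    For orientation, the cMDS baseline that the abstract compares against can be sketched as follows. This is a minimal textbook version of classical MDS using NumPy, not the authors' code; the function name `classical_mds` and its parameters are assumptions made for illustration. The dense eigendecomposition is the step that costs $O(n^3)$.

```python
import numpy as np

def classical_mds(D, dim):
    """Embed n points in `dim` dimensions from an n x n matrix of pairwise distances.

    Textbook classical MDS (illustrative sketch, not the paper's implementation).
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix of the centered points
    w, V = np.linalg.eigh(B)                 # symmetric eigendecomposition, O(n^3)
    idx = np.argsort(w)[::-1][:dim]          # indices of the `dim` largest eigenvalues
    w_top = np.clip(w[idx], 0.0, None)       # guard against tiny negative values
    return V[:, idx] * np.sqrt(w_top)        # coordinates, up to a rigid motion
```

    With exact distances from points in the plane, the embedding reproduces the input distance matrix up to rotation, reflection, and translation; with noisy distances it returns a least-squares-style approximation.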

    Gap Processing for Adaptive Maximal Poisson-Disk Sampling

    In this paper, we study the generation of maximal Poisson-disk sets with varying radii. First, we present a geometric analysis of gaps in such disk sets. This analysis is the basis for maximal and adaptive sampling in Euclidean space and on manifolds. Second, we propose efficient algorithms and data structures to detect gaps and to update them when disks are inserted, deleted, moved, or have their radius changed. We build on the concepts of the regular triangulation and the power diagram. Third, we show how our analysis contributes to the state of the art in surface remeshing. Comment: 16 pages. ACM Transactions on Graphics, 201
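
    To make the setting concrete, here is a naive dart-throwing sketch for Poisson-disk sampling with varying radii in the unit square. It is not the gap-processing method of the paper: it assumes a hypothetical sizing function `radius_at` and one common conflict rule (two samples conflict when their distance is below the larger of their radii), and it gives no guarantee of maximality, which is exactly the shortcoming that gap detection addresses.

```python
import math
import random

def radius_at(x, y):
    """Hypothetical sizing function: radius grows from 0.05 to 0.15 with x."""
    return 0.05 + 0.10 * x

def dart_throwing(n_trials, seed=0):
    """Accept random candidates that do not conflict with any accepted sample."""
    rng = random.Random(seed)
    samples = []  # accepted samples as (x, y, r)
    for _ in range(n_trials):
        x, y = rng.random(), rng.random()
        r = radius_at(x, y)
        # Assumed conflict rule: distance must be at least the larger radius.
        if all(math.hypot(x - sx, y - sy) >= max(r, sr) for sx, sy, sr in samples):
            samples.append((x, y, r))
    return samples
```

    Each candidate is checked against all accepted samples, so this sketch is quadratic in the number of samples; the power-diagram machinery mentioned in the abstract is what replaces both the brute-force check and the blind search for remaining gaps.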

    Minimizing the average distance to a closest leaf in a phylogenetic tree

    When performing an analysis on a collection of molecular sequences, it can be convenient to reduce the number of sequences under consideration while maintaining some characteristic of a larger collection of sequences. For example, one may wish to select a subset of high-quality sequences that represent the diversity of a larger collection of sequences. One may also wish to specialize a large database of characterized "reference sequences" to a smaller subset that is as close as possible on average to a collection of "query sequences" of interest. Such a representative subset can be useful whenever one wishes to find a set of reference sequences that is appropriate to use for comparative analysis of environmentally derived sequences, such as for selecting "reference tree" sequences for phylogenetic placement of metagenomic reads. In this paper we formalize these problems in terms of the minimization of the Average Distance to the Closest Leaf (ADCL) and investigate algorithms to perform the relevant minimization. We show that the greedy algorithm is not effective, show that a variant of the Partitioning Around Medoids (PAM) heuristic gets stuck in local minima, and develop an exact dynamic programming approach. Using this exact program we note that the performance of PAM appears to be good for simulated trees, and is faster than the exact algorithm for small trees. On the other hand, the exact program gives solutions for all numbers of leaves less than or equal to the given desired number of leaves, while PAM only gives a solution for the pre-specified number of leaves. Via application to real data, we show that the ADCL criterion chooses chimeric sequences less often than random subsets, while the maximization of phylogenetic diversity chooses them more often than random. These algorithms have been implemented in publicly available software.
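
    As a small illustration of the objective, the sketch below evaluates ADCL on a toy four-leaf tree and minimizes it by exhaustive search. It assumes ADCL is the mean, over all leaves, of the tree-path distance to the nearest selected leaf; the toy distances, the names `adcl` and `best_subset_bruteforce`, and the brute-force search are illustrative assumptions and are not the paper's dynamic program or its PAM variant.

```python
from itertools import combinations

# Toy tree ((A:1,B:1):2,(C:1,D:3)): path distances between leaves (hypothetical data).
leaves = ["A", "B", "C", "D"]
dist = {
    "A": {"A": 0, "B": 2, "C": 4, "D": 6},
    "B": {"A": 2, "B": 0, "C": 4, "D": 6},
    "C": {"A": 4, "B": 4, "C": 0, "D": 4},
    "D": {"A": 6, "B": 6, "C": 4, "D": 0},
}

def adcl(dist, leaves, selected):
    """Mean, over all leaves, of the distance to the nearest selected leaf."""
    return sum(min(dist[a][s] for s in selected) for a in leaves) / len(leaves)

def best_subset_bruteforce(dist, leaves, k):
    """Try every k-subset of leaves; only feasible for very small inputs."""
    return min(combinations(leaves, k), key=lambda sel: adcl(dist, leaves, sel))
```

    On this toy tree, keeping both leaves of one clade, such as `("A", "B")`, leaves the other clade far from any selected leaf, whereas a subset with one leaf from each clade gives a lower ADCL. Exhaustive search grows combinatorially with the number of leaves, which is why the abstract turns to heuristics and an exact dynamic program.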

    Machine-learned interatomic potentials for the syngas conversion on Rhodium

    The kinetics and thermodynamics of chemical processes such as heterogeneous catalytic reactions often depend on tremendously complex reaction networks, whose exploration quickly exceeds computational possibilities. Using first-principles methods to identify and calculate the relevant reaction steps therefore becomes infeasible, and new methods are required to overcome these challenges. Over the last decade, different machine-learning methods have been developed and applied to chemical problems. These methods range from neural networks to kernel-based methods such as kernel ridge regression or the training of Gaussian approximation potentials (GAPs), which is the machine-learning method used in this work. Besides the use of machine learning to overcome computational barriers, another aspect of handling complex reaction networks is their reduction to the most important reaction steps and intermediates. A prerequisite for reducing network complexity is knowledge of the appropriate energy landscape. Finding the global minimum of a chemical system can give deep insight into the relevant conformations of each structure involved, and thereby helps to build up the energetic landscape of a catalytic reaction. Therefore, in this work a method is developed that combines machine learning with a global-optimization approach to find the global minima of the adsorbates involved in the syngas conversion on catalytic rhodium surfaces. As part of the work, an iterative workflow for training a GAP is developed. Using this workflow, a system-specific potential is trained for the syngas conversion on rhodium surfaces. The resulting potential is then applied to the global optimization of the educts, intermediates and products that occur in this specific system, the syngas conversion on rhodium.