4,779 research outputs found

    Degenerating families of dendrograms

    Dendrograms used in data analysis are ultrametric spaces, hence objects of nonarchimedean geometry. It is known that there exist $p$-adic representations of dendrograms. Completed by a point at infinity, they can be viewed as subtrees of the Bruhat-Tits tree associated to the $p$-adic projective line. The implications are that certain moduli spaces known in algebraic geometry are $p$-adic parameter spaces of (families of) dendrograms, and stochastic classification can also be handled within this framework. At the end, we calculate the topology of the hidden part of a dendrogram. Comment: 13 pages, 8 figures
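The claim that dendrograms are ultrametric spaces can be illustrated with the $p$-adic distance itself. The following is a minimal sketch, not taken from the paper: the function names and the sample points are hypothetical, and it only checks the strong triangle inequality $d(x,z) \le \max(d(x,y), d(y,z))$ on a handful of integers.

```python
from fractions import Fraction
from itertools import permutations


def padic_valuation(n: int, p: int) -> int:
    """Largest k such that p**k divides n (n must be nonzero)."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def padic_distance(a: int, b: int, p: int) -> Fraction:
    """p-adic distance |a - b|_p = p**(-v_p(a - b)); 0 when a == b."""
    if a == b:
        return Fraction(0)
    return Fraction(1, p ** padic_valuation(a - b, p))


def is_ultrametric(points, dist) -> bool:
    """Check d(x, z) <= max(d(x, y), d(y, z)) for all ordered triples."""
    return all(
        dist(x, z) <= max(dist(x, y), dist(y, z))
        for x, y, z in permutations(points, 3)
    )


# Hypothetical sample: a few integers compared 2-adically.
points = [0, 1, 2, 4, 8, 9]


def d2(a, b):
    return padic_distance(a, b, 2)


def euclid(a, b):
    return abs(a - b)
```

Here `is_ultrametric(points, d2)` holds, while the ordinary absolute value fails the same test already on `[0, 1, 2]`, which is why the usual distance on the integers does not define a dendrogram in this sense.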

    Mumford dendrograms and discrete p-adic symmetries

    In this article, we present an effective encoding of dendrograms by embedding them into the Bruhat-Tits trees associated to $p$-adic number fields. As an application, we show how strings over a finite alphabet can be encoded in cyclotomic extensions of $\mathbb{Q}_p$ and discuss $p$-adic DNA encoding. The application leads to fast $p$-adic agglomerative hierarchic algorithms similar to the ones recently used e.g. by A. Khrennikov and others. From the viewpoint of $p$-adic geometry, to encode a dendrogram $X$ in a $p$-adic field $K$ means to fix a set $S$ of $K$-rational punctures on the $p$-adic projective line $\mathbb{P}^1$. To $\mathbb{P}^1\setminus S$ is associated in a natural way a subtree inside the Bruhat-Tits tree which recovers $X$, a method first used by F. Kato in 1999 in the classification of discrete subgroups of $\textrm{PGL}_2(K)$. Next, we show how the $p$-adic moduli space $\mathfrak{M}_{0,n}$ of $\mathbb{P}^1$ with $n$ punctures can be applied to the study of time series of dendrograms and those symmetries arising from hyperbolic actions on $\mathbb{P}^1$. In this way, we can associate to certain classes of dynamical systems a Mumford curve, i.e. a $p$-adic algebraic curve with totally degenerate reduction modulo $p$. Finally, we indicate some of our results in the study of general discrete actions on $\mathbb{P}^1$, and their relation to $p$-adic Hurwitz spaces. Comment: 14 pages, 6 figures
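A simplified illustration of encoding strings $p$-adically is sketched below. It is an assumption-laden toy, not the article's construction: it uses plain integers in base $p = 5$ rather than cyclotomic extensions of $\mathbb{Q}_p$, and the alphabet, digit assignment and function names are all hypothetical. Each letter becomes a nonzero base-$p$ digit, so the $p$-adic valuation of the difference of two codes equals the length of the common prefix of the two strings.

```python
ALPHABET = "ACGT"  # hypothetical DNA alphabet
P = 5              # smallest prime leaving room for four nonzero digits
DIGIT = {ch: i + 1 for i, ch in enumerate(ALPHABET)}  # A=1, C=2, G=3, T=4


def encode(s: str) -> int:
    """Read the string as a base-P expansion, first letter = lowest digit."""
    return sum(DIGIT[ch] * P ** i for i, ch in enumerate(s))


def valuation(n: int, p: int) -> int:
    """Largest k such that p**k divides the nonzero integer n."""
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def common_prefix_length(s: str, t: str) -> int:
    """Length of the shared prefix, recovered from the P-adic valuation."""
    x, y = encode(s), encode(t)
    if x == y:  # the encoding is injective, so the strings are equal
        return len(s)
    return valuation(x - y, P)
```

For example, `common_prefix_length("ACGT", "ACGA")` is 3, so the two codes are at $5$-adic distance $5^{-3}$. Grouping codes by the valuation of their differences gives the prefix tree of the strings, which is the kind of dendrogram the abstract refers to.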

    Interactive visualisation and exploration of biological data

    International audience. No abstract.

    Validation of purdue engineering shape benchmark clusters by crowdsourcing

    The effective organization of CAD data archives is central to PLM, and consequently content-based retrieval of 2D drawings and 3D models is often seen as a "holy grail" for the industry. Given this context, it is not surprising that the vision of a "Google for shape", which enables engineers to search databases of 3D models for components similar in shape to a query part, has motivated numerous researchers to investigate algorithms for computing geometric similarity. Measuring the effectiveness of the many approaches proposed has in turn led to the creation of benchmark datasets against which researchers can compare the performance of their search engines. However, to be useful, the datasets used to measure the effectiveness of 3D retrieval algorithms must not only define a collection of models, but also provide a canonical specification of their relative similarity. Because the objective of shape retrieval algorithms is (typically) to retrieve groups of objects that humans perceive as "similar", these benchmark similarity relationships have (by definition) to be manually determined through inspection

    Detecting similarities among distant homologous proteins by comparison of domain flexibilities

    The aim of this work is to assess the informativeness of protein dynamics in the detection of similarities among distant homologous proteins. To this end, an approach to perform large-scale comparisons of protein domain flexibilities is proposed. CONCOORD is confirmed as a reliable method for fast conformational sampling. The root mean square fluctuation of alpha carbon positions in the essential dynamics subspace is employed as a measure of local flexibility, and a synthetic index of similarity is presented. The dynamics of a large collection of protein domains from ASTRAL/SCOP40 is analyzed, and the possibility of identifying relationships, at both the family and the superfamily levels, on the basis of the dynamical features is discussed. The obtained picture is in agreement with the SCOP classification, and furthermore suggests the presence of a distinguishable familiar trend in the flexibility profiles. The results support the complementarity of the dynamical and the structural information, suggesting that information from dynamics analysis can arise from functional similarities, often partially hidden by a static comparison. On the basis of this first test, flexibility annotation can be expected to help in automatically detecting functional similarities otherwise unrecoverable. © 2007 The Author(s)
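The flexibility measure mentioned in this abstract, the root mean square fluctuation (RMSF), can be sketched as follows. This is a simplified illustration under stated assumptions: it computes the plain per-atom RMSF over an ensemble of already superimposed conformations and omits the projection onto the essential dynamics subspace that the authors use. The function name and input layout are hypothetical.

```python
import math


def rmsf(frames):
    """Per-atom RMSF over an ensemble of conformations.

    frames: list of conformations, each a list of (x, y, z) tuples with one
    entry per alpha carbon. The frames are assumed to be already aligned.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i across the ensemble.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy ensemble: atom 0 never moves, atom 1 oscillates along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
toy_profile = rmsf(toy_frames)
```

On the toy ensemble the rigid atom gets an RMSF of 0 and the oscillating atom an RMSF of 1. Flexibility profiles of this kind, one value per residue, are the objects the abstract proposes to compare across domains.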

    Axiomatic Construction of Hierarchical Clustering in Asymmetric Networks

    This paper considers networks where relationships between nodes are represented by directed dissimilarities. The goal is to study methods for the determination of hierarchical clusters, i.e., a family of nested partitions indexed by a connectivity parameter, induced by the given dissimilarity structures. Our construction of hierarchical clustering methods is based on defining admissible methods to be those methods that abide by the axioms of value - nodes in a network with two nodes are clustered together at the maximum of the two dissimilarities between them - and transformation - when dissimilarities are reduced, the network may become more clustered but not less. Several admissible methods are constructed and two particular methods, termed reciprocal and nonreciprocal clustering, are shown to provide upper and lower bounds in the space of admissible methods. Alternative clustering methodologies and axioms are further considered. Allowing the outcome of hierarchical clustering to be asymmetric, so that it matches the asymmetry of the original data, leads to the inception of quasi-clustering methods. The existence of a unique quasi-clustering method is shown. Allowing clustering in a two-node network to proceed at the minimum of the two dissimilarities generates an alternative axiomatic construction. There is a unique clustering method in this case too. The paper also develops algorithms for the computation of hierarchical clusters using matrix powers on a min-max dioid algebra and studies the stability of the methods proposed. We prove that most of the methods introduced in this paper are such that similar networks yield similar hierarchical clustering results. Algorithms are exemplified through their application to networks describing internal migration within states of the United States (U.S.) and the interrelation between sectors of the U.S. economy. Comment: This is a largely extended version of the previous conference submission under the same title. The current version contains the material in the previous version (published in ICASSP 2013) as well as material presented at the Asilomar Conference on Signals, Systems, and Computers 2013, GlobalSIP 2013, and ICML 2014. Also, unpublished material is included in the current version.
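The matrix-power computation mentioned in this abstract can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: it takes reciprocal clustering to mean "symmetrize each pair to the maximum of its two directed dissimilarities, then take minimum-cost chains", and nonreciprocal clustering to mean "take directed minimum-cost chains in both directions, then keep the larger", where the cost of a chain is its largest link. In the (min, max) dioid, the product replaces sums by minima and products by maxima, and the $(n-1)$-th power of an $n \times n$ dissimilarity matrix with zero diagonal gives the minimum chain costs. Function names and the example network are hypothetical.

```python
def minmax_product(A, B):
    """Matrix product in the (min, max) dioid: min over k of max(A[i][k], B[k][j])."""
    n = len(A)
    return [
        [min(max(A[i][k], B[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def minmax_power(A, m):
    """m-th dioid power of A for m >= 1."""
    result = A
    for _ in range(m - 1):
        result = minmax_product(result, A)
    return result


def reciprocal(A):
    """Symmetrize to the max of both directions, then take min-max chains."""
    n = len(A)
    sym = [[max(A[i][j], A[j][i]) for j in range(n)] for i in range(n)]
    return minmax_power(sym, n - 1)


def nonreciprocal(A):
    """Take directed min-max chains first, then the max of both directions."""
    n = len(A)
    U = minmax_power(A, n - 1)
    return [[max(U[i][j], U[j][i]) for j in range(n)] for i in range(n)]


# Hypothetical network: a cheap directed cycle 0 -> 1 -> 2 -> 0 (cost 1)
# with expensive reverse links (cost 5).
cycle = [
    [0, 1, 5],
    [5, 0, 1],
    [1, 5, 0],
]
two_nodes = [[0, 2], [3, 0]]
```

On `cycle`, reciprocal clustering merges all nodes at resolution 5, whereas nonreciprocal clustering merges them at resolution 1, since every node reaches every other along the cheap cycle. On `two_nodes`, both outputs merge the pair at 3, the maximum of the two dissimilarities, which is what the axiom of value requires.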