
    A functional limit theorem for the profile of b-ary trees

    In this paper we prove a functional limit theorem for the weighted profile of a b-ary tree. For the proof we use classical martingales connected to branching Markov processes and a generalized version of the profile-polynomial martingale. By embedding, and by choosing the weights and the branch factor appropriately, we rediscover the profiles of some well-known discrete-time trees. Comment: Published in the Annals of Applied Probability (http://www.imstat.org/aap/) at http://dx.doi.org/10.1214/09-AAP640 by the Institute of Mathematical Statistics (http://www.imstat.org).

    Genetic and morphological studies of Trichosirocalus species introduced to North America, Australia and New Zealand for the biological control of thistles

    Trichosirocalus horridus sensu lato has been used as a biological control agent of several invasive thistles (Carduus spp., Cirsium spp. and Onopordum spp.) since 1974. It was recognized as a single species until 2002, when it was split into three species based on morphological characters: T. horridus, Trichosirocalus briesei and Trichosirocalus mortadelo, each purported to have different host plants. Because of this taxonomic change, uncertainty exists as to which species were released in various countries; furthermore, there appear to be some exceptions to the purported host plants of some of these species. To resolve these questions, we conducted an integrative taxonomic study of the T. horridus species complex using molecular genetic and morphological analyses of specimens from three continents. Both mitochondrial cytochrome c oxidase subunit I and nuclear elongation factor 1α markers clearly indicate that there are only two distinct species, T. horridus and T. briesei. Molecular evidence, morphological analysis and host plant associations support the synonymy of T. horridus (Panzer, 1801) and T. mortadelo Alonso-Zarazaga & Sánchez-Ruiz, 2002. We determine that T. horridus has been established in Canada, USA, New Zealand and Australia and that T. briesei is established in Australia. The former species was collected from Carduus, Cirsium and Onopordum spp. in the field, whereas the latter appears to be specific to Onopordum.

    Reconstructing (super)trees from data sets with missing distances: Not all is lost

    The wealth of phylogenetic information accumulated over many decades of biological research, coupled with recent technological advances in molecular sequence generation, presents significant opportunities for researchers to investigate relationships across and within the kingdoms of life. However, to make the best use of this wealth of data, several problems must first be overcome. One key problem is finding effective strategies to deal with missing data. Here, we introduce Lasso, a novel heuristic approach for reconstructing rooted phylogenetic trees from distance matrices with missing values, for datasets where a molecular clock may be assumed. In contrast to other phylogenetic methods for partial datasets, Lasso possesses desirable properties such as its reconstructed trees being both unique and edge-weighted. These properties are achieved by Lasso restricting its leaf set to a large subset of all possible taxa, which in many practical situations is the entire taxa set. Furthermore, the Lasso approach is distance-based, rendering it very fast to run and suitable for datasets of all sizes, including large datasets such as those generated by modern Next Generation Sequencing technologies. To better understand the performance of Lasso, we assessed it by means of artificial and real biological datasets, showing its effectiveness in the presence of missing data. Furthermore, by formulating the supermatrix problem as a particular case of the missing data problem, we assessed Lasso's ability to reconstruct supertrees. We demonstrate that, although not specifically designed for such a purpose, Lasso performs better than or comparably with five leading supertree algorithms on a challenging biological dataset. Finally, we make freely available a software implementation of Lasso so that researchers may, for the first time, perform both rooted tree and supertree reconstruction with branch lengths on their own partial datasets.

    Percolation-like Scaling Exponents for Minimal Paths and Trees in the Stochastic Mean Field Model

    In the mean field (or random link) model there are $n$ points and inter-point distances are independent random variables. For $0 < \ell < \infty$ and in the $n \to \infty$ limit, let $\delta(\ell) = 1/n \times$ (maximum number of steps in a path whose average step-length is $\leq \ell$). The function $\delta(\ell)$ is analogous to the percolation function in percolation theory: there is a critical value $\ell_* = e^{-1}$ at which $\delta(\cdot)$ becomes non-zero, and (presumably) a scaling exponent $\beta$ in the sense $\delta(\ell) \asymp (\ell - \ell_*)^\beta$. Recently developed probabilistic methodology (in some sense a rephrasing of the cavity method of Mézard-Parisi) provides a simple albeit non-rigorous way of writing down such functions in terms of solutions of fixed-point equations for probability distributions. Solving numerically gives convincing evidence that $\beta = 3$. A parallel study with trees instead of paths gives scaling exponent $\beta = 2$. The new exponents coincide with those found in a different context (comparing optimal and near-optimal solutions of mean-field TSP and MST) and reinforce the suggestion that these scaling exponents determine universality classes for optimization problems on random points. Comment: 19 pages.

    Twitter event networks and the Superstar model

    A condensation phenomenon is often observed in social networks such as Twitter, where one "superstar" vertex gains a positive fraction of the edges while the remaining empirical degree distribution still exhibits a power-law tail. We formulate a mathematically tractable model for this phenomenon that provides a better fit to empirical data than the standard preferential attachment model across an array of networks observed in Twitter. Using embeddings in an equivalent continuous-time version of the process, and adapting techniques from the stable age-distribution theory of branching processes, we prove limit results for the proportion of edges that condense around the superstar, the degree distribution of the remaining vertices, the asymptotics of the maximal non-superstar degree, and the height of these random trees in the large network limit. Comment: Published at http://dx.doi.org/10.1214/14-AAP1053 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Asymptotic genealogy of a critical branching process

    Consider a continuous-time binary branching process conditioned to have population size n at some time t, and with a chance p for recording each extinct individual in the process. Within the family tree of this process, we consider the smallest subtree containing the genealogy of the extant individuals together with the genealogy of the recorded extinct individuals. We introduce a novel representation of such subtrees in terms of a point-process, and provide asymptotic results on the distribution of this point-process as the number of extant individuals increases. We motivate the study within the scope of a coherent analysis for an a priori model for macroevolution. Comment: 30 pages.

    The height of random k-trees and related branching processes

    We consider the height of random k-trees and k-Apollonian networks. These random graphs are not really trees, but instead have a tree-like structure. The height will be the maximum distance of a vertex from the root. We show that w.h.p. the height of random k-trees and k-Apollonian networks is asymptotic to c log t, where t is the number of vertices, and c = c(k) is given as the solution to a transcendental equation. The equations are slightly different for the two types of process. In the limit as k → ∞ the height of both processes is asymptotic to log t/(k log 2).