A functional limit theorem for the profile of b-ary trees
In this paper we prove a functional limit theorem for the weighted profile of
a b-ary tree. For the proof we use classical martingales connected to
branching Markov processes and a generalized version of the profile-polynomial
martingale. By embedding, and by choosing the weights and the branch factor
appropriately, we finally recover the profiles of some well-known
discrete-time trees. Comment: Published at http://dx.doi.org/10.1214/09-AAP640
in the Annals of Applied Probability (http://www.imstat.org/aap/) by the
Institute of Mathematical Statistics (http://www.imstat.org)
Genetic and morphological studies of Trichosirocalus species introduced to North America, Australia and New Zealand for the biological control of thistles
Trichosirocalus horridus sensu lato has been used as a biological control agent of several invasive thistles (Carduus spp., Cirsium spp. and Onopordum spp.) since 1974. It was recognized as a single species until 2002, when it was split into three species based on morphological characters: T. horridus, Trichosirocalus briesei and Trichosirocalus mortadelo, each purported to have different host plants. Because of this taxonomic change, uncertainty exists as to which species were released in various countries; furthermore, there appear to be some exceptions to the purported host plants of some of these species. To resolve these questions, we conducted an integrative taxonomic study of the T. horridus species complex using molecular genetic and morphological analyses of specimens from three continents. Both mitochondrial cytochrome c oxidase subunit I and nuclear elongation factor 1α markers clearly indicate that there are only two distinct species, T. horridus and T. briesei. Molecular evidence, morphological analysis and host plant associations support the synonymy of T. horridus (Panzer, 1801) and T. mortadelo Alonso-Zarazaga & Sánchez-Ruiz, 2002. We determine that T. horridus has been established in Canada, USA, New Zealand and Australia and that T. briesei is established in Australia. The former species was collected from Carduus, Cirsium and Onopordum spp. in the field, whereas the latter appears to be specific to Onopordum.
Reconstructing (super)trees from data sets with missing distances: Not all is lost
The wealth of phylogenetic information accumulated over many decades of biological research, coupled with recent technological advances in molecular sequence generation, presents significant opportunities for researchers to investigate relationships across and within the kingdoms of life. However, to make best use of this wealth of data, several problems must first be overcome. One key problem is finding effective strategies to deal with missing data. Here, we introduce Lasso, a novel heuristic approach for reconstructing rooted phylogenetic trees from distance matrices with missing values, for datasets where a molecular clock may be assumed. Unlike other phylogenetic methods for partial datasets, Lasso possesses desirable properties: its reconstructed trees are both unique and edge-weighted. These properties are achieved by restricting the leaf set to a large subset of all possible taxa, which in many practical situations is the entire taxa set. Furthermore, the Lasso approach is distance-based, rendering it very fast to run and suitable for datasets of all sizes, including large datasets such as those generated by modern Next Generation Sequencing technologies. To better understand the performance of Lasso, we assessed it by means of artificial and real biological datasets, showing its effectiveness in the presence of missing data. Furthermore, by formulating the supermatrix problem as a particular case of the missing data problem, we assessed Lasso's ability to reconstruct supertrees. We demonstrate that, although not specifically designed for such a purpose, Lasso performs better than or comparably with five leading supertree algorithms on a challenging biological dataset. Finally, we make freely available a software implementation of Lasso so that researchers may, for the first time, perform both rooted tree and supertree reconstruction with branch lengths on their own partial datasets.
Percolation-like Scaling Exponents for Minimal Paths and Trees in the Stochastic Mean Field Model
In the mean field (or random link) model there are $n$ points and inter-point
distances are independent random variables. For $0 < \ell < \infty$ and in the
$n \to \infty$ limit, let $\delta(\ell) = 1/n \times$ (maximum number of steps
in a path whose average step-length is $\leq \ell$). The function
$\delta(\ell)$ is analogous to the percolation function in percolation theory:
there is a critical value $\ell_* = e^{-1}$ at which $\delta(\cdot)$ becomes
non-zero, and (presumably) a scaling exponent $\beta$ in the sense
$\delta(\ell) \asymp (\ell - \ell_*)^\beta$. Recently developed probabilistic
methodology (in some sense a rephrasing of the cavity method of Mezard-Parisi)
provides a simple albeit non-rigorous way of writing down such functions in
terms of solutions of fixed-point equations for probability distributions.
Solving numerically gives convincing evidence that $\beta = 3$. A parallel
study with trees instead of paths gives scaling exponent $\beta = 2$. The new
exponents coincide with those found in a different context (comparing optimal
and near-optimal solutions of mean-field TSP and MST) and reinforce the
suggestion that these scaling exponents determine universality classes for
optimization problems on random points. Comment: 19 pages
Twitter event networks and the Superstar model
A condensation phenomenon is often observed in social networks such as Twitter
where one "superstar" vertex gains a positive fraction of the edges, while the
remaining empirical degree distribution still exhibits a power law tail. We
formulate a mathematically tractable model for this phenomenon that provides a
better fit to empirical data than the standard preferential attachment model
across an array of networks observed in Twitter. Using embeddings in an
equivalent continuous time version of the process, and adapting techniques from
the stable age-distribution theory of branching processes, we prove limit
results for the proportion of edges that condense around the superstar, the
degree distribution of the remaining vertices, maximal nonsuperstar degree
asymptotics and height of these random trees in the large network limit.
Comment: Published at http://dx.doi.org/10.1214/14-AAP1053 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
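
The condensation effect described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes one plausible reading of the model, in which each new vertex attaches to the superstar with a fixed probability `p` and otherwise attaches to a non-superstar vertex chosen with probability proportional to its degree. The function name `simulate_superstar` and the parameter `p` are hypothetical names introduced for illustration.

```python
import random


def simulate_superstar(n, p, seed=None):
    """Grow a random tree with one superstar (vertex 0) and n other vertices.

    Assumed rule (an illustration, not taken from the paper's text):
    each new vertex attaches to the superstar with probability p, and
    otherwise to a non-superstar vertex chosen proportionally to degree.
    Returns the list of vertex degrees, indexed by vertex label.
    """
    rng = random.Random(seed)
    degree = [1, 1]   # vertex 0 is the superstar; vertex 1 is attached to it
    endpoints = [1]   # each non-superstar vertex appears once per unit of degree
    for v in range(2, n + 1):
        if rng.random() < p:
            target = 0                      # edge goes to the superstar
        else:
            target = rng.choice(endpoints)  # degree-proportional choice
            endpoints.append(target)
        degree[target] += 1
        degree.append(1)                    # the new vertex has degree 1
        endpoints.append(v)
    return degree


if __name__ == "__main__":
    n = 20000
    deg = simulate_superstar(n, p=0.3, seed=1)
    print("fraction of edges at the superstar:", deg[0] / n)
```

Under this assumed rule the superstar's share of the `n` edges concentrates near `p`, while the remaining degrees are produced by a preferential-attachment mechanism, which is the qualitative picture the abstract describes.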
Asymptotic genealogy of a critical branching process
Consider a continuous-time binary branching process conditioned to have
population size n at some time t, and with a chance p for recording each
extinct individual in the process. Within the family tree of this process, we
consider the smallest subtree containing the genealogy of the extant
individuals together with the genealogy of the recorded extinct individuals. We
introduce a novel representation of such subtrees in terms of a point-process,
and provide asymptotic results on the distribution of this point-process as the
number of extant individuals increases. We motivate the study within the scope
of a coherent analysis for an a priori model for macroevolution. Comment: 30 pages
The height of random k-trees and related branching processes
We consider the height of random k-trees and k-Apollonian networks. These
random graphs are not really trees, but instead have a tree-like structure. The
height will be the maximum distance of a vertex from the root. We show that
w.h.p. the height of random k-trees and k-Apollonian networks is asymptotic to
c log t, where t is the number of vertices, and c = c(k) is given as the solution
to a transcendental equation. The equations are slightly different for the two
types of process. In the limit as k → ∞ the height of both processes is
asymptotic to log t/(k log 2).
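
The height studied in this abstract can be explored numerically. The sketch below rests on assumptions not stated in the abstract: it uses the usual construction in which each step picks a k-clique uniformly at random and joins a new vertex to all of its vertices, treats the k-Apollonian variant as the same process with the chosen clique removed afterwards, and measures distance from the initial clique (all of its vertices sit at distance 0). The function name `random_k_tree_height` is hypothetical.

```python
import random
from itertools import combinations


def random_k_tree_height(t, k, apollonian=False, seed=None):
    """Simulate a random k-tree on t vertices and return its height.

    Assumptions (for illustration only): the process starts from a single
    k-clique whose vertices are at distance 0, and the height is the largest
    graph distance from that initial clique. With apollonian=True the chosen
    clique is removed from the pool after it is used.
    """
    rng = random.Random(seed)
    dist = [0] * k                  # distances of the initial clique vertices
    cliques = [tuple(range(k))]     # pool of k-cliques that may be chosen
    for v in range(k, t):
        i = rng.randrange(len(cliques))
        chosen = cliques[i]
        if apollonian:
            # remove the chosen clique in O(1) by swapping with the last one
            cliques[i] = cliques[-1]
            cliques.pop()
        # The neighbours of any later vertex are mutually adjacent, so later
        # insertions never create shortcuts; the distance is fixed here.
        dist.append(1 + min(dist[u] for u in chosen))
        # The new vertex forms a k-clique with each (k-1)-subset of the chosen one.
        for sub in combinations(chosen, k - 1):
            cliques.append(sub + (v,))
    return max(dist)


if __name__ == "__main__":
    for k in (2, 3):
        print(k, random_k_tree_height(5000, k, seed=0))
```

Running the simulation for increasing `t` and comparing the returned height with log t gives a rough empirical feel for the constant c(k); it is not a substitute for the transcendental equation mentioned in the abstract.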