Topology Discovery of Sparse Random Graphs With Few Participants
We consider the task of topology discovery of sparse random graphs using
end-to-end random measurements (e.g., delay) between a subset of nodes,
referred to as the participants. The rest of the nodes are hidden, and do not
provide any information for topology discovery. We consider topology discovery
under two routing models: (a) the participants exchange messages along the
shortest paths and obtain end-to-end measurements, and (b) additionally, the
participants exchange messages along the second shortest path. For scenario
(a), our proposed algorithm results in a sub-linear edit-distance guarantee
using a sub-linear number of uniformly selected participants. For scenario (b),
we obtain a much stronger result, and show that we can achieve consistent
reconstruction when a sub-linear number of uniformly selected nodes
participate. This implies that accurate discovery of sparse random graphs is
tractable using an extremely small number of participants. We finally obtain a
lower bound on the number of participants required by any algorithm to
reconstruct the original random graph up to a given edit distance. We also
demonstrate that while consistent discovery is tractable for sparse random
graphs using a small number of participants, in general, there are graphs which
cannot be discovered by any algorithm even with a significant number of
participants, and with the availability of end-to-end information along all the
paths between the participants.
Comment: A shorter version appears in ACM SIGMETRICS 2011. This version is
scheduled to appear in J. on Random Structures and Algorithm
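To make routing model (a) concrete, the following minimal sketch simulates the measurement setup only: a small set of uniformly chosen participants records end-to-end shortest-path distances, and hidden nodes report nothing. It is not the paper's reconstruction algorithm. The Erdős–Rényi model G(n, c/n), the use of hop counts in place of random delays, and all function names and parameter values are assumptions made for illustration.

```python
import random
from collections import deque


def sparse_random_graph(n, c, rng):
    """Assumed model: Erdős–Rényi G(n, c/n), each edge present with probability c/n."""
    p = c / n
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def participant_measurements(adj, participants):
    """End-to-end distances between participants only; None if unreachable."""
    table = {}
    for a in participants:
        dist = hop_distances(adj, a)
        for b in participants:
            table[(a, b)] = dist.get(b)
    return table


rng = random.Random(0)                         # fixed seed for repeatability
graph = sparse_random_graph(200, 2.0, rng)     # hypothetical sizes
participants = rng.sample(range(200), 15)      # uniformly selected participants
measurements = participant_measurements(graph, participants)
```

The table `measurements` is all a discovery algorithm would see in this setting; the hidden nodes along each path never appear in it.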
Common lizards break Dollo’s law of irreversibility: genome-wide phylogenomics support a single origin of viviparity and re-evolution of oviparity
Dollo’s law of irreversibility states that once a complex trait has been lost in evolution, it cannot be regained. It is thought that complex epistatic interactions and developmental constraints impede the re-emergence of such a trait. Oviparous reproduction (egg-laying) requires the formation of an eggshell and represents an example of such a complex trait. In reptiles, viviparity (live-bearing) has evolved repeatedly but it is highly disputed if oviparity has re-evolved. Here, using up to 194,358 SNP loci and 1,334,760 bp of sequence, we reconstruct the phylogeny of viviparous and oviparous lineages of common lizards and infer the evolutionary history of parity modes. Our phylogeny supports six main common lizard lineages that have been previously identified. We find strong statistical support for a topological arrangement that suggests a reversal to oviparity from viviparity. Our topology is consistent with highly differentiated chromosomal configurations between lineages, but disagrees with previous phylogenetic studies in some nodes. While we find high support for a reversal to oviparity, more genomic and developmental data are needed to robustly test this and assess the mechanism by which a reversal might have occurred
Phylogenetic mixtures: Concentration of measure in the large-tree limit
The reconstruction of phylogenies from DNA or protein sequences is a major
task of computational evolutionary biology. Common phenomena, notably
variations in mutation rates across genomes and incongruences between gene
lineage histories, often make it necessary to model molecular data as
originating from a mixture of phylogenies. Such mixed models play an
increasingly important role in practice. Using concentration of measure
techniques, we show that mixtures of large trees are typically identifiable. We
also derive sequence-length requirements for high-probability reconstruction.
Comment: Published at http://dx.doi.org/10.1214/11-AAP837 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Phylogenetic Codivergence Supports Coevolution of Mimetic Heliconius Butterflies
The unpalatable and warning-patterned butterflies _Heliconius erato_ and _Heliconius melpomene_ provide the best studied example of mutualistic Müllerian mimicry, thought – but rarely demonstrated – to promote coevolution. Some of the strongest available evidence for coevolution comes from phylogenetic codivergence, the parallel divergence of ecologically associated lineages. Early evolutionary reconstructions suggested codivergence between mimetic populations of _H. erato_ and _H. melpomene_, and this was initially hailed as the most striking known case of coevolution. However, subsequent molecular phylogenetic analyses found discrepancies in phylogenetic branching patterns and timing (topological and temporal incongruence) that argued against codivergence. We present the first explicit cophylogenetic test of codivergence between mimetic populations of _H. erato_ and _H. melpomene_, and re-examine the timing of these radiations. We find statistically significant topological congruence between multilocus coalescent population phylogenies of _H. erato_ and _H. melpomene_, supporting repeated codivergence of mimetic populations. Divergence time estimates, based on a Bayesian coalescent model, suggest that the evolutionary radiations of _H. erato_ and _H. melpomene_ occurred over the same time period, and are compatible with a series of temporally congruent codivergence events. This evidence supports a history of reciprocal coevolution between Müllerian co-mimics characterised by phylogenetic codivergence and parallel phenotypic change
Inference of population splits and mixtures from genome-wide allele frequency data
Many aspects of the historical relationships between populations in a species
are reflected in genetic data. Inferring these relationships from genetic data,
however, remains a challenging task. In this paper, we present a statistical
model for inferring the patterns of population splits and mixtures in multiple
populations. In this model, the sampled populations in a species are related to
their common ancestor through a graph of ancestral populations. Using
genome-wide allele frequency data and a Gaussian approximation to genetic
drift, we infer the structure of this graph. We applied this method to a set of
55 human populations and a set of 82 dog breeds and wild canids. In both
species, we show that a simple bifurcating tree does not fully describe the
data; in contrast, we infer many migration events. While some of the migration
events that we find have been detected previously, many have not. For example,
in the human data we infer that Cambodians trace approximately 16% of their
ancestry to a population ancestral to other extant East Asian populations. In
the dog data, we infer that both the boxer and basenji trace a considerable
fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to
domestication, and that East Asian toy breeds (the Shih Tzu and the Pekingese)
result from admixture between modern toy breeds and "ancient" Asian breeds.
Software implementing the model described here, called TreeMix, is available at
http://treemix.googlecode.com
Comment: 28 pages, 6 figures in main text. Attached supplement is 22 pages, 15
figures. This is an updated version of the preprint available at
http://precedings.nature.com/documents/6956/version/
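As a rough illustration of the kind of summary statistic a Gaussian drift approximation works with, the sketch below computes the sample covariance of allele frequencies between populations. It is not TreeMix's implementation and does not fit a graph. The function name, the centering by the across-population mean, and the toy frequencies are all assumptions made for this example.

```python
def frequency_covariance(freqs):
    """Sample covariance between populations.

    freqs[i][k] is the allele frequency of SNP k in population i.
    Each SNP is centered by its mean across populations (an assumed choice).
    """
    n_pops = len(freqs)
    n_snps = len(freqs[0])
    snp_means = [
        sum(freqs[i][k] for i in range(n_pops)) / n_pops for k in range(n_snps)
    ]
    centered = [
        [freqs[i][k] - snp_means[k] for k in range(n_snps)] for i in range(n_pops)
    ]
    return [
        [
            sum(centered[i][k] * centered[j][k] for k in range(n_snps)) / n_snps
            for j in range(n_pops)
        ]
        for i in range(n_pops)
    ]


# Toy data (invented): populations 0 and 1 have similar frequencies, 2 differs.
toy = [
    [0.10, 0.50, 0.90],
    [0.12, 0.48, 0.88],
    [0.40, 0.20, 0.60],
]
cov = frequency_covariance(toy)
```

In this toy input, populations 0 and 1 covary positively while population 2 covaries negatively with both, which is the sort of pattern a tree or graph of populations is then asked to explain.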
Uncertainty in phylogenetic tree estimates
Estimating phylogenetic trees is an important problem in evolutionary
biology, environmental policy and medicine. Although trees are estimated, their
uncertainties are discarded by mathematicians working in tree space. Here we
explicitly model the multivariate uncertainty of tree estimates. We consider
both the cases where uncertainty information arises extrinsically (through
covariate information) and intrinsically (through the tree estimates
themselves). The importance of accounting for tree uncertainty in tree space is
demonstrated in two case studies. In the first instance, differences between
gene trees are small relative to their uncertainties, while in the second, the
differences are relatively large. Our main goal is visualization of tree
uncertainty, and we demonstrate advantages of our method with respect to
reproducibility, speed and preservation of topological differences compared to
visualization based on multidimensional scaling. The proposal highlights that
phylogenetic trees are estimated in an extremely high-dimensional space,
resulting in uncertainty information that cannot be discarded. Most
importantly, it is a method that allows biologists to diagnose whether
differences between gene trees are biologically meaningful, or due to
uncertainty in estimation.
Comment: Final version accepted to Journal of Computational and Graphical
Statistic