Comparison of articulate brachiopod nuclear and mitochondrial gene trees leads to a clade-based redefinition of protostomes (Protostomozoa) and deuterostomes (Deuterostomozoa)
Nuclear and mtDNA sequences from selected short-looped terebratuloid (terebratulacean) articulate brachiopods yield congruent and genetically independent phylogenetic reconstructions by parsimony, neighbor-joining and maximum likelihood methods, suggesting that both sources of data are reliable guides to brachiopod species phylogeny. The present-day genealogical relationships and geographical distributions of the tested terebratuloid brachiopods are consistent with a Tethyan dispersal and subsequent radiation. Concordance of nuclear and mitochondrial gene phylogenies reinforces previous indications that articulate brachiopods, inarticulate brachiopods, phoronids and ectoprocts cluster with other organisms generally regarded as protostomes. Since ontogeny and morphology in brachiopods, ectoprocts and phoronids depart in important respects from those features supposedly diagnostic of protostomes, this demonstrates that the operational definition of protostomy by the usual ontological characters must be misleading or unreliable. New, molecular, operational definitions are proposed to replace the traditional criteria for the recognition of protostomes and deuterostomes, and the clade-based terms 'Protostomozoa' and 'Deuterostomozoa' are proposed to replace the existing terms 'Protostomia' and 'Deuterostomia'.
Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies
Existing sequence alignment algorithms use heuristic scoring schemes which
cannot be used as objective distance metrics. Therefore one relies on measures
like the p- or log-det distances, or makes explicit, and often simplistic,
assumptions about sequence evolution. Information theory provides an
alternative, in the form of mutual information (MI) which is, in principle, an
objective and model independent similarity measure. MI can be estimated by
concatenating and zipping sequences, yielding thereby the "normalized
compression distance". So far this has produced promising results, but with
uncontrolled errors. We describe a simple approach to get robust estimates of
MI from global pairwise alignments. Using standard alignment algorithms, this
gives for animal mitochondrial DNA estimates that are strikingly close to
estimates obtained from the alignment free methods mentioned above. Our main
result uses algorithmic (Kolmogorov) information theory, but we show that
similar results can also be obtained from Shannon theory. Due to the fact that
it is not additive, normalized compression distance is not an optimal metric
for phylogenetics, but we propose a simple modification that overcomes the
issue of additivity. We test several versions of our MI based distance measures
on a large number of randomly chosen quartets and demonstrate that they all
perform better than traditional measures like the Kimura or log-det (resp.
paralinear) distances. Even a simplified version based on single letter Shannon
entropies, which can be easily incorporated in existing software packages, gave
superior results throughout the entire animal kingdom. But we see the main
virtue of our approach in a more general way. For example, it can also help to
judge the relative merits of different alignment algorithms, by estimating the
significance of specific alignments. Comment: 19 pages + 16 pages of supplementary material
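The "concatenating and zipping" estimate mentioned in this abstract can be illustrated with a minimal sketch of the normalized compression distance. It assumes zlib as the compressor and plain ASCII strings as input; the function names are hypothetical, and this is not the alignment-based estimator the authors propose.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two sequences.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size and xy is the concatenation.
    """
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Values near 0 indicate that one sequence adds little information given the other; values near 1 indicate largely unrelated sequences. Because a real compressor only approximates Kolmogorov complexity, the result depends on the compressor and on sequence length.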
The Qphyl System: a web-based interactive system for phylogenetic analysis
Phylogenetic tree reconstruction is a prominent problem in computational biology. Currently, all computational methods have their limitations and work well only for simple problems of small size. No existing method can guarantee that the trees it constructs for large and complex real-world problems are the true phylogenetic trees, mainly because the existing computational models are not very biologically realistic. This has become a serious issue for many important real-life applications, which often require accurate results from phylogenetic analysis. It is therefore crucial to incorporate multi-disciplinary analyses effectively and to synthesize results from various sources when answering real-life questions. In this thesis, a novel web-based phylogeny reconstruction system with a real-time interactive environment, called Qphyl (short for quartet-based phylogenetic analysis), is introduced. The Qphyl system uses a new interactive approach that enables biologists to greatly improve the final results by interacting dynamically with the computation: for example, users can move the computation back and forth between stages to check intermediate results, compare results from different methods, and carry out manual refinements, using their domain-specific biological knowledge to decide how a tree should be reconstructed. Currently, the alpha version of this web-based interactive system has been released and is accessible through the URL: http://ww-test.it.usyd.edu.au/sogrid/qphyl/
Comparative Performance of Supertree Algorithms in Large Data Sets Using the Soapberry Family (Sapindaceae) as a Case Study
For the last 2 decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. 
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove to be particularly useful for promising approaches such as the maximum quartet fit method, which still requires substantial computing power.
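As an illustration of how matrix representation with parsimony encodes input trees, the sketch below builds a Baum-Ragan style binary matrix: one column per clade of each input tree, with "1" for taxa inside the clade, "0" for taxa in that tree but outside the clade, and "?" for taxa absent from the tree. The representation of a tree as a pair (taxon set, list of clades) and the function name are assumptions made for this example; the study itself used established software, and the parsimony search on the resulting matrix is not shown.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed representation: a tree is (its taxa, its non-trivial clades).
Tree = Tuple[FrozenSet[str], List[FrozenSet[str]]]


def mrp_matrix(input_trees: List[Tree]) -> Dict[str, str]:
    """Return one row of '0'/'1'/'?' characters per taxon."""
    all_taxa = sorted(set().union(*(taxa for taxa, _ in input_trees)))
    rows: Dict[str, List[str]] = {t: [] for t in all_taxa}
    for taxa, clades in input_trees:
        # Sort clades so that the column order is deterministic.
        for clade in sorted(clades, key=sorted):
            for t in all_taxa:
                if t in clade:
                    rows[t].append("1")
                elif t in taxa:
                    rows[t].append("0")
                else:
                    rows[t].append("?")  # taxon missing from this tree
    return {t: "".join(r) for t, r in rows.items()}


# Two small, partially overlapping input trees (hypothetical data).
tree1: Tree = (frozenset("ABC"), [frozenset("AB")])
tree2: Tree = (frozenset("BCD"), [frozenset("CD")])
matrix = mrp_matrix([tree1, tree2])
```

Here `matrix` maps A to "1?", B to "10", C to "01" and D to "?1"; a parsimony analysis of such a matrix yields the supertree.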
Molecular phylogeny of brachiopods and phoronids based on nuclear-encoded small subunit ribosomal RNA gene sequences
Brachiopod and phoronid phylogeny is inferred from SSU rDNA sequences of 28 articulate and nine inarticulate brachiopods, three phoronids, two ectoprocts and various outgroups, using gene trees reconstructed by weighted parsimony, distance and maximum likelihood methods. Of these sequences, 33 from brachiopods, two from phoronids and one each from an ectoproct and a priapulan are newly determined. The brachiopod sequences belong to 31 different genera and thus survey about 10% of extant genus-level diversity. Sequences determined in different laboratories and those from closely related taxa agree well, but evidence is presented suggesting that one published phoronid sequence (GenBank accession UO12648) is a brachiopod-phoronid chimaera, and this sequence is excluded from the analyses. The chiton, Acanthopleura, is identified as the phenetically proximal outgroup; other selected outgroups were chosen to allow comparison with recent, non-molecular analyses of brachiopod phylogeny. The different outgroups and methods of phylogenetic reconstruction lead to similar results, with differences mainly in the resolution of weakly supported ancient and recent nodes, including the divergence of inarticulate brachiopod sub-phyla, the position of the rhynchonellids in relation to long- and short-looped articulate brachiopod clades and the relationships of some articulate brachiopod genera and species. Attention is drawn to the problem presented by nodes that are strongly supported by non-molecular evidence but receive only low bootstrap resampling support. Overall, the gene trees agree with morphology-based brachiopod taxonomy, but novel relationships are tentatively suggested for thecideidine and megathyrid brachiopods. Articulate brachiopods are found to be monophyletic in all reconstructions, but monophyly of inarticulate brachiopods and the possible inclusion of phoronids in the inarticulate brachiopod clade are less strongly established. 
Phoronids are clearly excluded from a sister-group relationship with articulate brachiopods, this proposed relationship being due to the rejected, chimaeric sequence (GenBank UO12648). Lineage relative rate tests show no heterogeneity of evolutionary rate among articulate brachiopod sequences, but indicate that inarticulate brachiopod plus phoronid sequences evolve somewhat more slowly. Both brachiopods and phoronids evolve slowly by comparison with other invertebrates. A number of palaeontologically dated times of earliest appearance are used to make upper and lower estimates of the global rate of brachiopod SSU rDNA evolution, and these estimates are used to infer the likely divergence times of other nodes in the gene tree. There is reasonable agreement between most inferred molecular and palaeontological ages. The estimated rates of SSU rDNA sequence evolution suggest that the last common ancestor of brachiopods, chitons and other protostome invertebrates (Lophotrochozoa and Ecdysozoa) lived deep in Precambrian time. Results of this first DNA-based, taxonomically representative analysis of brachiopod phylogeny are in broad agreement with current morphology-based classification and systematics and are largely consistent with the hypothesis that brachiopod shell ontogeny and morphology are a good guide to phylogeny
Four myriapod relatives – but who are sisters? No end to debates on relationships among the four major myriapod subgroups
Background: Phylogenetic relationships among the myriapod subgroups Chilopoda, Diplopoda, Symphyla and Pauropoda are still not robustly resolved. The first phylogenomic study covering all subgroups resolved phylogenetic relationships congruently to morphological evidence but is in conflict with most previously published phylogenetic trees based on diverse molecular data. Outgroup choice and long-branch attraction effects were stated as possible explanations for these incongruencies. In this study, we addressed these issues by extending the myriapod and outgroup taxon sampling using transcriptome data. Results: We generated new transcriptome data of 42 panarthropod species, including all four myriapod subgroups and additional outgroup taxa. Our taxon sampling was complemented by published transcriptome and genome data resulting in a supermatrix covering 59 species. We compiled two data sets, the first with a full coverage of genes per species (292 single-copy protein-coding genes), the second with a less stringent coverage (988 genes). We inferred phylogenetic relationships among myriapods using different data types, tree inference, and quartet computation approaches. Our results unambiguously support monophyletic Mandibulata and Myriapoda. Our analyses clearly showed that there is strong signal for a single unrooted topology, but a sensitivity of the position of the internal root on the choice of outgroups. However, we observe strong evidence for a clade Pauropoda+Symphyla, as well as for a clade Chilopoda+Diplopoda. Conclusions: Our best quartet topology is incongruent with current morphological phylogenies which were supported in another phylogenomic study. AU tests and quartet mapping reject the quartet topology congruent to trees inferred with morphological characters. Moreover, quartet mapping shows that confounding signal present in the data set is sufficient to explain the weak signal for the quartet topology derived from morphological characters. 
Although outgroup choice affects results, our study could narrow possible trees to derivatives of a single quartet topology. For highly disputed relationships, we propose applying a series of tests (AU and quartet mapping), since the results of such tests make it possible to narrow down possible relationships and to rule out confounding signal.
Developing and applying supertree methods in Phylogenomics and Macroevolution
Supertrees can be used to combine partially overlapping trees and generate more inclusive phylogenies. It has been proposed that a Maximum Likelihood (ML) supertree method (SM) could be developed using an exponential probability distribution to model errors in the input trees (given a proposed supertree). When the tree-to-tree distances used in the ML computation are symmetric differences, the ML SM has been shown to be equivalent to a Majority-Rule consensus SM, and hence, exactly as the latter, it has the desirable property of being a median tree (with reference to the set of input trees). The ability to estimate the likelihood of supertrees allows implementing Bayesian (MCMC) approaches, which have the advantage of allowing the support for the clades in a supertree to be properly estimated. I present here the L.U.St software package; it contains the first implementation of an ML SM and allows for the first time statistical tests on supertrees. I also characterized the first implementation of the Bayesian (MCMC) SM. Both the ML and the Bayesian (MCMC) SMs have been tested for and found to be immune to biases. The Bayesian (MCMC) SM is applied to the reanalyses of a variety of datasets (i.e. the datasets for the Metazoa and the Carnivora), and I have also recovered the first Bayesian supertree-based phylogeny of the Eubacteria and the Archaebacteria. These new SMs are discussed, with reference to other, well-known SMs like Matrix Representation with Parsimony. Both the ML and Bayesian SM offer multiple attractive advantages over current alternatives - …
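The exponential error model described in this abstract can be sketched as follows, under stated assumptions: each tree is represented by its set of non-trivial clades, the supertree is pruned to the taxa of each input tree before comparison, the distance is the symmetric difference of the clade sets, and the normalizing constant of the exponential distribution is omitted. The parameter `beta` and all function names are hypothetical; this is not the L.U.St implementation.

```python
from typing import FrozenSet, List, Set, Tuple

Clade = FrozenSet[str]


def restrict(clades: Set[Clade], taxa: FrozenSet[str]) -> Set[Clade]:
    """Prune a clade set to `taxa`, dropping clades that become trivial."""
    pruned = {c & taxa for c in clades}
    return {c for c in pruned if 1 < len(c) < len(taxa)}


def symmetric_difference(a: Set[Clade], b: Set[Clade]) -> int:
    """Number of clades found in exactly one of the two trees."""
    return len(a ^ b)


def log_likelihood(supertree: Set[Clade],
                   input_trees: List[Tuple[FrozenSet[str], Set[Clade]]],
                   beta: float = 1.0) -> float:
    """Unnormalized log-likelihood: sum over input trees of -beta * d(T_i, S)."""
    total = 0.0
    for taxa, clades in input_trees:
        d = symmetric_difference(restrict(supertree, taxa),
                                 restrict(clades, taxa))
        total += -beta * d
    return total


# Hypothetical example: two input trees on overlapping taxon sets.
inputs = [(frozenset("ABC"), {frozenset("AB")}),
          (frozenset("CDE"), {frozenset("DE")})]
compatible = {frozenset("AB"), frozenset("ABC"), frozenset("DE")}
conflicting = {frozenset("AC"), frozenset("ABC"), frozenset("DE")}
```

With these inputs, `log_likelihood(compatible, inputs)` is 0 because the pruned supertree matches both input trees, while `log_likelihood(conflicting, inputs)` is -2 because the first input tree differs by two clades. A supertree search would look for the candidate that maximizes this score.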