25 research outputs found
Eumalacostracan phylogeny and total evidence: limitations of the usual suspects
Background: The phylogeny of Eumalacostraca (Crustacea) remains elusive, despite over a century of interest. Recent morphological and molecular phylogenies appear highly incongruent, but this has not been assessed quantitatively. Moreover, 18S rRNA trees show striking branch length differences between species, accompanied by a conspicuous clustering of taxa with similar branch lengths. Surprisingly, previous research found no rate heterogeneity. Hitherto, no phylogenetic analysis of all major eumalacostracan taxa (orders) has either combined evidence from multiple loci, or combined molecular and morphological evidence.
Results: We combined evidence from four nuclear ribosomal and mitochondrial loci (18S rRNA, 28S rRNA, 16S rRNA, and cytochrome c oxidase subunit I) with a newly synthesized morphological dataset. We tested the homogeneity of data partitions, both in terms of character congruence and the topological congruence of inferred trees. We also performed Bayesian and parsimony analyses on separate and combined partitions, and tested the contribution of each partition. We tested for potential long-branch attraction (LBA) using taxon deletion experiments, and with relative rate tests. Additionally, we searched for molecular polytomies (spurious clades). Lastly, we investigated the phylogenetic stability of taxa, and assessed their impact on inferred relationships over the whole tree. We detected significant conflict between data partitions, especially between morphology and molecules. We found significant rate heterogeneity between species for both the 18S rRNA and combined datasets, introducing the possibility of LBA. As a test case, we showed that LBA probably affected the position of Spelaeogriphacea in the combined molecular evidence analysis. We also demonstrated that several clades, including the previously reported and surprising clade of Amphipoda plus Spelaeogriphacea, are 'supported' by zero-length branches. Furthermore, we showed that different sets of taxa have the greatest impact upon the relationships within molecular versus morphological trees.
Conclusion: Rate heterogeneity and conflict between data partitions mean that existing molecular and morphological evidence is unable to resolve a well-supported eumalacostracan phylogeny. We believe that it will be necessary to look beyond the most commonly utilized sources of data (nuclear ribosomal and mitochondrial sequences) to obtain a robust tree in the future.
New rRNA Gene-Based Phylogenies of the Alphaproteobacteria Provide Perspective on Major Groups, Mitochondrial Ancestry and Phylogenetic Instability
Bacteria in the class Alphaproteobacteria have a wide variety of lifestyles and physiologies. They include pathogens
of humans and livestock, agriculturally valuable strains, and several highly abundant marine groups. The ancestor of
mitochondria also originated in this clade. Despite significant effort to investigate the phylogeny of the
Alphaproteobacteria with a variety of methods, there remains considerable disparity in the placement of several
groups. Recent emphasis on phylogenies derived from multiple protein-coding genes remains contentious due to
disagreement over appropriate gene selection and the potential influences of systematic error. We revisited previous
investigations in this area using concatenated alignments of the small and large subunit (SSU and LSU) rRNA genes,
as we show here that these loci have much lower GC bias than whole genomes. This approach has allowed us to
update the canonical 16S rRNA gene tree of the Alphaproteobacteria with additional important taxa that were not
previously included, and with added resolution provided by concatenating the SSU and LSU genes. We investigated
the topological stability of the Alphaproteobacteria by varying alignment methods, rate models, taxon selection and
RY-recoding to circumvent GC content bias. We also introduce RYMK-recoding and show that it avoids some of the
information loss in RY-recoding. We demonstrate that the topology of the Alphaproteobacteria is sensitive to
inclusion of several groups of taxa, but it is less affected by the choice of alignment and rate methods. The majority of
topologies and comparative results from Approximately Unbiased tests provide support for positioning the
Rickettsiales and the mitochondrial branch within a clade. This composite clade is a sister group to the abundant
marine SAR11 clade (Pelagibacterales). Furthermore, we add support for taxonomic assignment of several recently
sequenced taxa. Accordingly, we propose three subclasses within the Alphaproteobacteria: the Caulobacteridae, the
Rickettsidae, and the Magnetococcidae.
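The RY-recoding mentioned in this abstract can be illustrated with a minimal sketch. It assumes the conventional definition of RY-recoding (purines A and G become R, pyrimidines C and T/U become Y); the function name `ry_recode` is hypothetical and this is not the authors' code. The RYMK-recoding that the abstract introduces is not defined in this listing, so it is not sketched here.

```python
# Minimal sketch of conventional RY-recoding (not the authors' implementation).
# Assumption: purines (A, G) -> R, pyrimidines (C, T, U) -> Y; gaps and
# ambiguity codes are passed through unchanged.
PURINES = set("AG")
PYRIMIDINES = set("CTU")


def ry_recode(seq: str) -> str:
    """Return the RY-recoded form of a nucleotide sequence."""
    recoded = []
    for base in seq.upper():
        if base in PURINES:
            recoded.append("R")
        elif base in PYRIMIDINES:
            recoded.append("Y")
        else:
            recoded.append(base)  # gap, N, or other ambiguity code
    return "".join(recoded)


if __name__ == "__main__":
    # A GC-rich and an AT-rich sequence become identical after recoding,
    # because A<->G and C<->T differences are no longer visible.
    print(ry_recode("GGCC"))
    print(ry_recode("AATT"))
```

Under this recoding only transversions remain visible, which is why it removes differences in GC content and also why it discards some phylogenetic signal.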
ThrashJCameronMicrobiologyNewrRNAGene-Based_SupportingInformation.zip
ThrashJCameronMicrobiologyNewrRNAGene-Based.pdf
Box plot of the distributions of GC content for the genomes, SSU and LSU rRNA genes for members of the primary clade headed by Caulobacter (C), the clade headed by Rickettsia (R) and the mitochondria (M).
Boxes indicate the interquartile range, with adjacent values as whiskers and outlying values as circles; the median is indicated by the horizontal black line. Box width is proportional to the total number of samples.
Tree inferred with the Arb-sina aligned complete dataset under a GTRΓ model.
Bootstrap values (n = 1000) are indicated at the nodes. Red arrows indicate how a taxon or clade differs in the other regularly coded trees, with values in square brackets indicating in how many trees this is seen. If there are one or more differences within a family, this is indicated after the name of the family. The leaves of the phylogram are collapsed into taxonomic families and into the host phyla for mitochondria. The internal topology of the Rhodospirillales order is not the same in all primary trees, therefore it has been expanded to show all leaves (inset).
Relationship of rRNA gene vs. genomic GC content for Alphaproteobacteria and mitochondria.
The rRNA gene GC content was calculated from the entire sequence using a Perl script, while the genomic GC content was taken from the IMG database. The "other orders" group includes the Caulobacterales, Sphingomonadales, Rhizobiales, Rhodobacterales and Parvularculales.
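The whole-sequence GC calculation described in this caption can be sketched as follows. The authors used a Perl script that is not reproduced in this listing; the Python below is an illustrative stand-in, and the function name `gc_content` is hypothetical. It assumes that gaps and ambiguity codes are excluded from the denominator, which the caption does not specify.

```python
# Illustrative stand-in for a whole-sequence GC calculation (not the authors'
# Perl script). Assumption: only unambiguous bases (A, C, G, T/U) are counted;
# gaps and ambiguity codes are ignored.
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of the unambiguous bases in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    print(gc_content("ATGC"))      # half of the counted bases are G or C
    print(gc_content("AUGC--NN"))  # gaps and N are not counted
```

Counting only unambiguous bases keeps aligned sequences with many gaps comparable to unaligned ones; a script that divides by full sequence length would give lower values for gapped input.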
Fragmenstein: predicting protein-ligand structures of compounds derived from known crystallographic fragment hits using a strict conserved-binding–based methodology
Current strategies centred on either merging or linking initial hits from fragment-based drug design (FBDD) crystallographic screens ignore 3D structural information. We show that an algorithmic approach (Fragmenstein) that "stitches" the ligand atoms from this structural information together can provide more accurate and reliable predictions for protein-ligand complex conformation than existing methods such as pharmacophore-constrained docking. This approach works under the assumption of conserved binding: when a larger molecule is designed containing the initial fragment hit, the common substructure between the two will adopt the same binding mode. Fragmenstein either takes the coordinates of ligands from an experimental fragment screen and stitches the atoms together to produce a novel merged compound, or uses them to predict the complex for a provided compound. The compound is then energy minimised under strong constraints to obtain a structurally plausible compound. This method is successful in showing the importance of using the coordinates of known binders when predicting the conformation of derivative compounds through a retrospective analysis of the COVID Moonshot data. It has also had a real-world application in hit-to-lead screening, yielding a sub-micromolar merger from parent hits in a single round.