25 research outputs found

    Eumalacostracan phylogeny and total evidence: limitations of the usual suspects

    <p>Abstract</p> <p>Background</p> <p>The phylogeny of Eumalacostraca (Crustacea) remains elusive, despite over a century of interest. Recent morphological and molecular phylogenies appear highly incongruent, but this has not been assessed quantitatively. Moreover, 18S rRNA trees show striking branch length differences between species, accompanied by a conspicuous clustering of taxa with similar branch lengths. Surprisingly, previous research found no rate heterogeneity. Hitherto, no phylogenetic analysis of all major eumalacostracan taxa (orders) has either combined evidence from multiple loci, or combined molecular and morphological evidence.</p> <p>Results</p> <p>We combined evidence from four nuclear ribosomal and mitochondrial loci (18S rRNA, 28S rRNA, 16S rRNA, and cytochrome <i>c</i> oxidase subunit I) with a newly synthesized morphological dataset. We tested the homogeneity of data partitions, both in terms of character congruence and the topological congruence of inferred trees. We also performed Bayesian and parsimony analyses on separate and combined partitions, and tested the contribution of each partition. We tested for potential long-branch attraction (LBA) using taxon deletion experiments, and with relative rate tests. Additionally, we searched for molecular polytomies (spurious clades). Lastly, we investigated the phylogenetic stability of taxa, and assessed their impact on inferred relationships over the whole tree. We detected significant conflict between data partitions, especially between morphology and molecules. We found significant rate heterogeneity between species for both the 18S rRNA and combined datasets, introducing the possibility of LBA. As a test case, we showed that LBA probably affected the position of Spelaeogriphacea in the combined molecular evidence analysis. We also demonstrated that several clades, including the previously reported and surprising clade of Amphipoda plus Spelaeogriphacea, are 'supported' by zero length branches. Furthermore, we showed that different sets of taxa have the greatest impact upon the relationships within molecular versus morphological trees.</p> <p>Conclusion</p> <p>Rate heterogeneity and conflict between data partitions mean that existing molecular and morphological evidence is unable to resolve a well-supported eumalacostracan phylogeny. We believe that it will be necessary to look beyond the most commonly utilized sources of data (nuclear ribosomal and mitochondrial sequences) to obtain a robust tree in the future.</p>

    Tree inferred with the Arb-sina aligned complete dataset under a GTRΓ model.

    <p>Bootstrap values (n = 1000) are indicated at the nodes. Red arrows indicate how a taxon or clade differs in the other regularly coded trees, with values in square brackets indicating in how many trees this is seen. If there are one or more differences within a family, this is indicated after the name of the family. The leaves of the phylogram are collapsed into taxonomic families and into the host phyla for mitochondria. The internal topology of the <i>Rhodospirillales</i> order is not the same in all primary trees; therefore, it has been expanded to show all leaves (inset).</p>

    Box plot of the distributions of GC content for the genomes, SSU and LSU rRNA genes for members of the primary clade headed by <i>Caulobacter</i> (C), the clade headed by <i>Rickettsia</i> (R) and the mitochondria (M).

    <p>Boxes indicate the interquartile range, with adjacent values as whiskers, outlying values as circles; median indicated by the horizontal black line. Box width is proportional to the total number of samples.</p>
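The box-plot elements named in this caption can be illustrated with a minimal sketch. It assumes the common Tukey convention, in which the adjacent values are the most extreme observations lying within 1.5 times the interquartile range of the quartiles; the caption does not state which convention or quartile method the authors used. The function name `box_stats` and the choice of the "inclusive" quartile method are hypothetical.

```python
from statistics import quantiles


def box_stats(values):
    """Summarise a sample the way a Tukey-style box plot draws it.

    Assumption (not stated in the caption): whiskers end at the adjacent
    values, i.e. the most extreme observations within 1.5 * IQR of the box.
    """
    data = sorted(values)
    # Quartiles; the "inclusive" interpolation method is an arbitrary choice here.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Observations inside the fences; the extremes of these are the whisker ends.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    # Observations outside the fences would be drawn as circles.
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_adjacent": inside[0],
        "upper_adjacent": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up GC percentages, purely for illustration.
    print(box_stats([48.1, 49.5, 50.2, 51.0, 52.3, 53.8, 54.4, 70.9]))
```

Different plotting packages interpolate quartiles differently, so box edges computed this way may differ slightly from those in the published figure.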

    Relationship of rRNA gene vs. genomic GC content for <i>Alphaproteobacteria</i> and mitochondria.

    <p>The rRNA gene GC content was calculated from the entire sequence using a Perl script, while the genomic GC content was taken from the IMG database. The “other orders” group includes the <i>Caulobacterales</i>, <i>Sphingomonadales</i>, <i>Rhizobiales</i>, <i>Rhodobacterales</i> and <i>Parvularculales</i>.</p>
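As an illustration of the whole-sequence GC calculation mentioned in this caption, here is a minimal sketch in Python rather than Perl. It is not the authors' script: the function name `gc_content` and the decision to ignore gaps, whitespace and ambiguity codes are assumptions made for the example.

```python
def gc_content(sequence: str) -> float:
    """Return the GC content of a nucleotide sequence as a percentage.

    Assumption: only unambiguous bases (A, C, G, T, U) are counted;
    gaps, whitespace and ambiguity codes such as N are ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Made-up fragment, purely for illustration.
    print(gc_content("GGCUAAGCCUGAUC"))
```

Applying the function to each full-length SSU or LSU rRNA gene sequence would give the per-gene values that are then plotted against the genomic GC content.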

    Fragmenstein: predicting protein-ligand structures of compounds derived from known crystallographic fragment hits using a strict conserved-binding–based methodology

    Current strategies centred on either merging or linking initial hits from fragment-based drug design (FBDD) crystallographic screens ignore 3D structural information. We show that an algorithmic approach (Fragmenstein) that ‘stitches’ the ligand atoms from this structural information together can provide more accurate and reliable predictions for protein-ligand complex conformation than existing methods such as pharmacophore-constrained docking. This approach works under the assumption of conserved binding: when a larger molecule is designed containing the initial fragment hit, the common substructure between the two will adopt the same binding mode. Fragmenstein either takes the coordinates of ligands from an experimental fragment screen and stitches the atoms together to produce a novel merged compound, or uses them to predict the complex for a provided compound. The compound is then energy minimised under strong constraints to obtain a structurally plausible compound. This method is successful in showing the importance of using the coordinates of known binders when predicting the conformation of derivative compounds through a retrospective analysis of the COVID Moonshot data. It has also had a real-world application in hit-to-lead screening, yielding a sub-micromolar merger from parent hits in a single round.