40 research outputs found

    apex: phylogenetics with multiple genes.

    Genetic sequences of multiple genes are becoming increasingly common for a wide range of organisms including viruses, bacteria and eukaryotes. While such data may sometimes be treated as a single locus, in practice, a number of biological and statistical phenomena can lead to phylogenetic incongruence. In such cases, different loci should, at least as a preliminary step, be examined and analysed separately. The R software has become a popular platform for phylogenetics, with several packages implementing distance-based, parsimony and likelihood-based phylogenetic reconstruction, and an even greater number of packages implementing phylogenetic comparative methods. Unfortunately, basic data structures and tools for analysing multiple genes have so far been lacking, thereby limiting potential for investigating phylogenetic incongruence. In this study, we introduce the new R package apex to fill this gap. apex implements new object classes, which extend existing standards for storing DNA and amino acid sequences, and provides a number of convenient tools for handling, visualizing and analysing these data. In this study, we introduce the main features of the package and illustrate its functionalities through the analysis of a simple data set

    Multiple origins of green blood in New Guinea lizards

    © 2018 The Authors. Several species of lizards from the megadiverse island of New Guinea have evolved green blood. An unusually high concentration of the green bile pigment biliverdin in the circulatory system of these lizards makes the blood, muscles, bones, tongue, and mucosal tissues bright green in color, eclipsing the crimson color from their red blood cells. This is a remarkable physiological feature because bile pigments are toxic physiological waste products of red blood cell catabolism and, when chronically elevated, cause jaundice in humans and all other vertebrates. Although these lizards offer a promising system to examine the evolution of extraordinary physiological characteristics, little is known about the phylogenetic relationships of green-blooded lizards or the evolutionary origins of green blood. We present the first extensive phylogeny for green-blooded lizards and closely related Australasian lizards using thousands of genomic regions to examine the evolutionary history of this unusual trait. Maximum likelihood ancestral character state reconstruction supports four independent origins of green blood. Our results lay the phylogenetic foundation necessary to determine the role, if any, of natural selection in shaping this enigmatic physiological trait as well as understanding the genetic, proteomic, and biochemical basis for the lack of jaundice in those species that have independently evolved green blood

    Comparative genomics of Mycobacterium africanum Lineage 5 and Lineage 6 from Ghana suggests distinct ecological niches.

    Mycobacterium africanum (Maf) causes a substantial proportion of human tuberculosis in some countries of West Africa, but little is known about this pathogen. We compared the genomes of 253 Maf clinical isolates from Ghana, including N = 175 Lineage 5 (L5) and N = 78 Lineage 6 (L6). We found that the genomic diversity of L6 was higher than that of L5 despite the smaller sample size. Regulatory proteins appeared to evolve neutrally in L5 but under purifying selection in L6. Even though over 90% of the human T cell epitopes were conserved in both lineages, L6 showed a higher ratio of non-synonymous to synonymous single nucleotide variation in these epitopes overall compared to L5. Of the 10% of human T cell epitopes that were variable, most carried mutations that were lineage-specific. Our findings indicate that Maf L5 and L6 differ in some of their population genomic characteristics, possibly reflecting different selection pressures linked to distinct ecological niches

    Islands of linkage in an ocean of pervasive recombination reveals two-speed evolution of human cytomegalovirus genomes

    Human cytomegalovirus (HCMV) infects most of the population worldwide, persisting throughout the host's life in a latent state with periodic episodes of reactivation. While typically asymptomatic, HCMV can cause fatal disease among congenitally infected infants and immunocompromised patients. These clinical issues are compounded by the emergence of antiviral resistance and the absence of an effective vaccine, the development of which is likely complicated by the numerous immune evasins encoded by HCMV to counter the host's adaptive immune responses, a feature that facilitates frequent super-infections. Understanding the evolutionary dynamics of HCMV is essential for the development of effective new drugs and vaccines. By comparing viral genomes from uncultivated or low-passaged clinical samples of diverse origins, we observe evidence of frequent homologous recombination events, both recent and ancient, and no structure of HCMV genetic diversity at the whole-genome scale. Analysis of individual gene-scale loci reveals a striking dichotomy: while most of the genome is highly conserved, recombines essentially freely and has evolved under purifying selection, 21 genes display extreme diversity, structured into distinct genotypes that do not recombine with each other. Most of these hyper-variable genes encode glycoproteins involved in cell entry or escape of host immunity. Evidence that half of them have diverged through episodes of intense positive selection suggests that rapid evolution of hyper-variable loci is likely driven by interactions with host immunity. It appears that this process is enabled by recombination unlinking hyper-variable loci from strongly constrained neighboring sites. It is conceivable that viral mechanisms facilitating super-infection have evolved to promote recombination between diverged genotypes, allowing the virus to continuously diversify at key loci to escape immune detection, while maintaining a genome optimally adapted to its asymptomatic infectious lifecycle

    Unsupervised machine learning and geometric morphometrics as tools for the identification of inter and intraspecific variations in the Anopheles Maculipennis complex

    Geometric morphometric analysis was combined with two different unsupervised machine learning algorithms, UMAP and HDBSCAN, to visualize morphological differences in wing shape among and within four Anopheles sibling species (An. atroparvus, An. melanoon, An. maculipennis s.s. and An. daciae sp. inq.) of the Maculipennis complex in Northern Italy. Specifically, we evaluated: (1) wing shape variation among and within species; (2) the consistency between groups of An. maculipennis s.s. and An. daciae sp. inq. identified based on COI sequences and wing shape variability; and (3) the spatial and temporal distribution of different morphotypes. UMAP detected at least 13 main patterns of variation in wing shape among the four analyzed species and mapped intraspecific morphological variations. The relationship between the most abundant COI haplotypes of An. daciae sp. inq. and shape ordination/variation was not significant. However, morphological variation within haplotypes was reported. HDBSCAN also recognized different clusters of morphotypes within An. daciae sp. inq. (12) and An. maculipennis s.s. (4). All morphotypes shared a similar pattern of variation in the subcostal vein, in the anal vein and in the radio-medial cross-vein of the wing. By contrast, the marginal part of the wings remained unchanged in all clusters of both species. No significant spatio-temporal difference was observed in the frequency of the identified morphotypes. Our study demonstrates that machine learning algorithms are a useful tool when combined with geometric morphometrics, and suggests that inter- and intraspecific shape variability should be analyzed further to evaluate evolutionary constraints related to wing functionality

    Population genomics and phylogeography of a benthic coastal shark (Scyliorhinus canicula) using 2b-RAD single nucleotide polymorphisms

    The existence of strong genetic structure is expected in species with limited ability to disperse and philopatric behaviour. These life-history traits are found in many small benthic elasmobranchs, such as in the small-spotted catshark (Scyliorhinus canicula). However, no evidence of genetic structure was found across its northeastern Atlantic (NEA) range using traditional molecular markers. Here, fine-scale genetic differentiation was detected between the British Isles and southern Iberia using 2674 single nucleotide polymorphism loci generated using 2b-restriction site-associated DNA (2b-RAD). Geographical distance and historical demography were two major drivers shaping the distribution of genetic diversity of S. canicula along the NEA. Significant positive spatial autocorrelation of allelic frequencies was detected, with genetic differentiation generally increasing with geographical distance. However, marked genetic divergence of the Celtic Sea and South Portugal collections from their closest neighbours resulted in geographically constrained genetic breaks south of the British Isles and off southwestern Iberia. Historical demographic reconstruction of population pairs across these genetic breaks suggested a scenario of historical isolation before secondary contact, probably related to distinct northern and southern glacial refugia. These results provide new insights into the population structure of S. canicula along the NEA and serve as a reference for benthic elasmobranchs with similar distribution ranges

    ORTHOSCOPE: An Automatic Web Tool for Phylogenetically Inferring Bilaterian Orthogroups with User-Selected Taxa

    Identification of orthologous or paralogous relationships of coding genes is fundamental to all aspects of comparative genomics. For accurate identification of orthologs among deeply diversified bilaterian lineages, precise estimation of gene trees is indispensable, given the complicated histories of genes over millions of years. By estimating gene trees, orthologs can be identified as members of an orthogroup, a set of genes descended from a single gene in the last common ancestor of all the species being considered. In addition to comparisons with a given species tree, purposeful taxonomic sampling increases the accuracy of gene tree estimation and orthogroup identification. Although some major phylogenetic relationships of bilaterians are gradually being unraveled, the scattering of published genomic data among separate web databases is becoming a significant hindrance to identification of orthogroups with appropriate taxonomic sampling. By integrating more than 250 metazoan gene models predicted in genome projects, we developed a web tool called ORTHOSCOPE to identify orthogroups of specific protein-coding genes within major bilaterian lineages. ORTHOSCOPE allows users to employ several sequences of a specific molecule and broadly accepted nodes included in a user-specified species tree as queries and to evaluate the reliability of estimated orthogroups based on topologies and node support values of estimated gene trees. A test analysis using data from 36 bilaterians was accomplished within 140 s. ORTHOSCOPE results can be used to evaluate orthologs identified by other stand-alone programs using genome-scale data. ORTHOSCOPE is freely available at https://www.orthoscope.jp or https://github.com/jun-inoue/orthoscope (last accessed December 28, 2018)