8 research outputs found

    Network dynamics of eukaryotic LTR retroelements beyond phylogenetic trees

    Background: Sequencing projects have allowed diverse retroviruses and LTR retrotransposons from different eukaryotic organisms to be characterized. It is known that retroviruses and other retro-transcribing viruses evolve from LTR retrotransposons and that this whole system clusters into five families: Ty3/Gypsy, Retroviridae, Ty1/Copia, Bel/Pao and Caulimoviridae. Phylogenetic analyses usually show that these split into multiple distinct lineages, but what is yet to be understood is how deep evolution occurred in this system.
    Results: We combined phylogenetic and graph analyses to investigate the history of LTR retroelements both as a tree and as a network. We used 268 non-redundant LTR retroelements, many of them introduced for the first time in this work, to elucidate all possible LTR retroelement phylogenetic patterns. These were superimposed over the tree of eukaryotes to investigate the dynamics of the system at distinct evolutionary times. Next, we investigated phenotypic features such as duplication and variability of amino acid motifs, and several differences in genomic ORF organization. Using this information we characterized eight reticulate evolution markers to construct phenotypic network models.
    Conclusion: The evolutionary history of LTR retroelements can be traced as a time-evolving network that depends on phylogenetic patterns, epigenetic host-factors and phenotypic plasticity. The Ty1/Copia and the Ty3/Gypsy families represent the oldest patterns in this network, which we found mimics eukaryotic macroevolution. The emergence of the Bel/Pao, Retroviridae and Caulimoviridae families in this network can be related to distinct inflations of the Ty3/Gypsy family, at distinct evolutionary times. This suggests that Ty3/Gypsy ancestors diversified much more than their Ty1/Copia counterparts, at distinct geological eras. Consistent with the principle of preferential attachment, the connectivities among phenotypic markers, taken as network-represented combinations, are power-law distributed. This evidences an inflationary mode of evolution where the system diversity: 1) expands continuously, alternating vertical and gradual processes of phylogenetic divergence with episodes of modular, saltatory and reticulate evolution; 2) is governed by the intrinsic capability of distinct LTR retroelement host-communities to self-organize their phenotypes according to emergent laws characteristic of complex systems.
    Reviewers: This article was reviewed by Eugene V. Koonin, Eric Bapteste, and Enmanuelle Lerat (nominated by King Jordan).

    A user guide for the online exploration and visualization of PCAWG data.

    Funder: U.S. Department of Health & Human Services | NIH | National Cancer Institute (NCI)
    Funder: Ontario Institute for Cancer Research (Institut Ontarien de Recherche sur le Cancer); doi: https://doi.org/10.13039/100012118
    Funder: EMBL Member States EU FP7 Programme projects EurocanPlatform (260791) CAGEKID (241669)
    Funder: European Union’s Framework Programme For Research and Innovation Horizon 2020 under the Marie Sklodowska-Curie grant agreement no. 703543
    Funder: Michael & Susan Dell Foundation; Mary K. Chapman Foundation; CCSG Grant P30 CA016672 (Bioinformatics Shared Resource); ITCR U24 CA199461; GDAN U24 CA210949; GDAN U24 CA210950
    Funder: European Commission's H2020 Programme, project SOUND, Grant Agreement no 633974
    Funder: Spanish Government (SEV 2015-0493) BSC-Lenovo Master Collaboration Agreement (2015)
    The Pan-Cancer Analysis of Whole Genomes (PCAWG) project generated a vast amount of whole-genome cancer sequencing resource data. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we provide a user's guide to the five publicly available online data exploration and visualization tools introduced in the PCAWG marker paper. These tools are ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. We detail use cases and analyses for each tool, show how they incorporate outside resources from the larger genomics ecosystem, and demonstrate how the tools can be used together to understand the biology of cancers more deeply. Together, the tools enable researchers to query the complex genomic PCAWG data dynamically and integrate external information, enabling and enhancing interpretation.

    Expression Atlas update: gene and protein expression in multiple species.

    The EMBL-EBI Expression Atlas is an added-value knowledge base that enables researchers to answer the question of where (tissue, organism part, developmental stage, cell type) and under which conditions (disease, treatment, gender, etc.) a gene or protein of interest is expressed. Expression Atlas brings together data from >4500 expression studies from >65 different species, across different conditions and tissues. It makes these data freely available in an easy-to-visualise form, after expert curation to accurately represent the intended experimental design and re-analysis via standardised pipelines that rely on open-source, community-developed tools. Each study's metadata are annotated using ontologies. The data are re-analysed with the aim of reproducing the original conclusions of the underlying experiments. Expression Atlas is currently divided into Bulk Expression Atlas and Single Cell Expression Atlas. Expression Atlas contains data from differential studies (microarray and bulk RNA-Seq) and baseline studies (bulk RNA-Seq and proteomics), whereas Single Cell Expression Atlas is currently dedicated to Single Cell RNA-Sequencing (scRNA-Seq) studies. The resource has been in continuous development since 2009 and it is available at https://www.ebi.ac.uk/gxa

    GyDB mobilomics: LTR retroelements and integrase-related transposons of the pea aphid Acyrthosiphon pisum genome

    The Gypsy Database concerning Mobile Genetic Elements (release 2.0) is a wiki-style project devoted to the phylogenetic classification of LTR retroelements and their viral and host gene relatives characterized from distinct organisms. Furthermore, GyDB 2.0 is concerned with studying mobile elements within genomes. Therefore, an in-progress repository was created for databases with annotations of mobile genetic elements from particular genomes. This repository is called Mobilomics, and the first uploaded database contains 549 LTR retroelements and related transposases that have been annotated from the genome of the pea aphid Acyrthosiphon pisum. Mobilomics is accessible from the GyDB 2.0 project using the URL: http://gydb.org/index.php/Mobilomics