19 research outputs found

    Transposable elements, from their annotation to their integration into knowledge graphs

    No full text
    National audience. Transposable elements (TEs) are major players in the structure and evolution of eukaryote genomes. Thanks to their ability to move around and replicate within genomes, they are probably the most important contributors to genome plasticity. Individuals of the same species independently undergo TE insertions, which causes inter-individual genetic variability. This variability is the basis of the natural selection that leads to increased adaptation of individuals to their environment. One way to search for a potential role of TEs in host adaptation is a pangenomic approach. The REPET package integrates bioinformatics pipelines dedicated to detecting and annotating TEs in genomes. The PanREPET pipeline then describes (i) TE insertions present in all individuals of the species (core genome), (ii) insertions present only in a subset of individuals (dispensable genome), (iii) insertions found among individuals that share the same environment (ecogenome), and (iv) insertions specific to a single individual. To identify TE candidates putatively involved in local adaptation, environmental knowledge and genome annotations have been integrated into a semantic knowledge graph.
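
    The four insertion categories listed in this abstract can be illustrated with a minimal sketch. This is not PanREPET code: the function name `classify_insertion`, the input structures, and the rule that an "ecogenome" insertion is one carried by several (but not all) individuals that all live in the same environment are assumptions made for illustration only.

```python
from typing import Dict, Set


def classify_insertion(carriers: Set[str], environment_of: Dict[str, str]) -> str:
    """Assign one TE insertion to a pangenome category (hypothetical rules).

    carriers: individuals in which the insertion was detected.
    environment_of: maps every individual of the panel to its environment label.
    """
    all_individuals = set(environment_of)
    if not carriers:
        raise ValueError("an insertion must be carried by at least one individual")
    if not carriers <= all_individuals:
        raise ValueError("unknown individual in carriers")

    # (i) present in every individual of the species panel
    if carriers == all_individuals:
        return "core"
    # (iv) present in exactly one individual
    if len(carriers) == 1:
        return "individual-specific"
    # (iii) assumption: all carriers live in the same environment
    if len({environment_of[ind] for ind in carriers}) == 1:
        return "ecogenome"
    # (ii) any other subset of individuals
    return "dispensable"


# Hypothetical panel of four individuals sampled in two environments.
panel = {"ind1": "dry", "ind2": "dry", "ind3": "wet", "ind4": "wet"}
for carriers in ({"ind1", "ind2", "ind3", "ind4"}, {"ind1", "ind2"}, {"ind1", "ind3"}, {"ind4"}):
    print(sorted(carriers), classify_insertion(carriers, panel))
```

    The order of the checks matters under these assumed rules: an insertion carried by a single individual would otherwise also satisfy the same-environment test.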

    An integrated information system dedicated to oak genomics and genetics

    No full text
    GnpIS is an information system designed to integrate and link genomic, genetic and environmental data into a single environment dedicated to plant (crops and forest trees) and fungi data. GnpIS is released several times a year and is regularly improved with new functionalities that answer specific needs raised by scientists. We illustrate the integrated genome annotation system we set up, with a focus on the interoperability between genomic and genetic data (e.g. markers, QTL) present in GnpIS-core, through the use case of Quercus robur (the pedunculate oak), which has a large, complex and highly heterozygous genome. This genome annotation system relies on GMOD interfaces such as WebApollo/JBrowse and InterMine to make these data available in a user-friendly environment. All annotations and analysis results (transposable elements (TEs), genes, ncRNA ...) and the functional annotation of protein-coding genes were obtained using robust pipelines: (i) REPET, used to detect, classify and annotate TEs, which represent 50% of the genome; (ii) Eugene, which integrates ab initio and similarity-based gene-finding software to predict gene models; (iii) several tools to annotate ncRNA (lncRNA, miRNA, rRNA, tRNA); and (iv) a functional annotation pipeline, mainly based on InterProScan and comparative genomics, run on the 25,808 high-confidence predicted proteins. This system allows experts to analyze their protein families of interest and to curate/validate gene structures. Together, these resources provide a framework to study the two key evolutionary processes that explain the remarkable diversity found within the Quercus genus: local adaptation and speciation.

    RepetDB: a unified resource for transposable element references

    No full text
    International audience. Transposable elements (TEs) are major players in the structure and evolution of eukaryote genomes. Thanks to their ability to move around and replicate within genomes, they are probably the most important contributors to genome plasticity. The insertion of TEs close to genes can affect gene structure, expression and function, contributing to the genetic diversity underlying species adaptation. Many studies have shown that TEs are generally silenced through epigenetic defense mechanisms, and that these elements play an important role in epigenetic genome regulation. Their detection and annotation are considered essential and must be undertaken in the frame of any genome sequencing project. Here, we present the new version of RepetDB [1] (Amselem et al., Mobile DNA, 2019) (https://urgi.versailles.inrae.fr/repetdb), our TE database developed to store and retrieve detected, classified and annotated TEs in a standardized manner. RepetDB v2 adds 31 more species of plants and fungi and provides TE consensus sequences with the evidence that justifies their classification. RepetDB v2 is a customized implementation of InterMine [2,3], an open-source data warehouse framework used here to store, search, browse, analyze and compare all the data recorded for each TE reference sequence. InterMine provides powerful capabilities to query and visualize all biological information on TEs. It supports simple searches of the database with QuickSearch (a 'Google-like' search) and more complex queries with the QueryBuilder to display the desired information. RepetDB v2 is designed to be a TE knowledge base populated with full de novo TE annotations of complete (or near-complete) genome sequences. The description and classification of TEs facilitates the exploration of specific TE families, superfamilies or orders across a large range of species. It also makes possible cross-species searches and comparisons of TE family content between genomes.
    References
    1. Amselem, J., Cornut, G., Choisne, N., Alaux, M., Alfama-Depauw, F., Jamilloux, V., Maumus, F., Letellier, T., Luyten, I., Pommier, C., Adam-Blondon, A. F., & Quesneville, H. (2019). RepetDB: a unified resource for transposable element references. Mobile DNA, 10, 6. https://doi.org/10.1186/s13100-019-0150-y
    2. Kalderimis A, Lyne R, Butano D, Contrino S, Lyne M, Heimbach J, Hu F, Smith R, Štěpán R, Sullivan J, Micklem G. InterMine: extensive web services for modern biology. Nucleic Acids Res. 2014 Jul; 42 (Web Server issue): W468-72.
    3. Smith RN, Aleksic J, Butano D, Carr A, Contrino S, Hu F, Lyne M, Lyne R, Kalderimis A, Rutherford K, Stepan R, Sullivan J, Wakeling M, Watkins X, Micklem G. InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data. Bioinformatics (2012) 28 (23): 3163-3165.

    A computational architecture designed for genome annotation: oak genome sequencing project as a use case

    No full text
    The ANR Genoak project aims to study the two key evolutionary processes that explain the remarkable diversity found within the oak genus. We performed an automated structural annotation (transposable elements (TEs) and genes) and a functional annotation of predicted genes using robust pipelines: (i) REPET for TEs, (ii) Eugene for gene prediction, and (iii) FunAnnotPipe (an in-house pipeline), mainly based on InterProScan, for functional annotation. Further objectives were to (i) integrate the whole genome with all annotated features into a genome browser, (ii) provide an interface for gene prediction curation/validation, and (iii) provide an information system oriented towards accessibility and interoperability.