11 research outputs found

    IWGSC Sequence Repository: Moving towards tools to facilitate data integration for the reference sequence of wheat

    URGI is a genomics and bioinformatics research unit at INRA (French National Institute for Agricultural Research), dedicated to plants and crop parasites. We develop and maintain a genomic and genetic Information System called GnpIS that manages multiple types of wheat data. Under the umbrella of the IWGSC (International Wheat Genome Sequencing Consortium), we have set up a Sequence Repository on the Wheat@URGI website to store, browse and BLAST the data being generated by the wheat genome project: http://wheat-urgi.versailles.inra.fr/Seq-Repository. The repository holds the wheat physical maps, the chromosome survey sequence data for the individual chromosomes of bread wheat, and draft sequences for diploid and tetraploid wheats, and provides browsable access to the BAC-based reference sequence for chromosome 3B, the first of the chromosomes to be completed by the consortium. I will highlight the new features and data available in the Sequence Repository (e.g., new BLAST functionalities) and, in particular, present what we have done to address needs and concerns raised during the IWGSC S&P workshop last year. In addition, I will open the discussion about the future needs for tools to facilitate the integration of data to produce the reference sequence.

    Tracheophyte genomes keep track of the deep evolution of the Caulimoviridae

    Endogenous viral elements (EVEs) are viral sequences that are integrated in the nuclear genomes of their hosts and are signatures of viral infections that may have occurred millions of years ago. The study of EVEs, coined paleovirology, provides important insights into virus evolution. The Caulimoviridae is the most common group of EVEs in plants, although their presence has often been overlooked in plant genome studies. We have refined methods for the identification of caulimovirid EVEs and interrogated the genomes of a broad diversity of plant taxa, from algae to advanced flowering plants. Evidence is provided that almost every vascular plant (tracheophyte), including the most primitive taxa (clubmosses, ferns and gymnosperms), contains caulimovirid EVEs, many of which represent previously unrecognized evolutionary branches. In angiosperms, EVEs from at least one and as many as five different caulimovirid genera were frequently detected, and florendoviruses were the most widely distributed, followed by petuviruses. From the analysis of the distribution of different caulimovirid genera within different plant species, we propose a working evolutionary scenario in which this family of viruses emerged at the latest during the Devonian era (approx. 320 million years ago), followed by vertical transmission and by several cross-division host swaps.

    RepetDB: a unified resource for transposable element references

    Background: Thanks to their ability to move around and replicate within genomes, transposable elements (TEs) are perhaps the most important contributors to genome plasticity and evolution. Their detection and annotation are considered essential in any genome sequencing project. The number of fully sequenced genomes is rapidly increasing with improvements in high-throughput sequencing technologies. A fully automated de novo annotation process for TEs is therefore required to cope with the deluge of sequence data. However, all automated procedures are error-prone, and an automated procedure for TE identification and classification would be no exception. It is therefore crucial to provide not only the TE reference sequences, but also evidence justifying their classification, at the scale of the whole genome. A few TE databases already exist, but none provides evidence to justify TE classification. Moreover, biological information about the sequences remains globally poor. Results: We present here the RepetDB database developed in the framework of GnpIS, a genetic and genomic information system. RepetDB is designed to store and retrieve detected, classified and annotated TEs in a standardized manner. RepetDB is an implementation with extensions of InterMine, an open-source data warehouse framework used here to store, search, browse, analyze and compare all the data recorded for each TE reference sequence. InterMine can display diverse information for each sequence and allows simple to very complex queries. Finally, TE data are displayed via a worldwide data discovery portal. RepetDB is accessible at urgi.versailles.inra.fr/repetdb. Conclusions: RepetDB is designed to be a TE knowledge base populated with full de novo TE annotations of complete (or near-complete) genome sequences. Indeed, the description and classification of TEs facilitates the exploration of specific TE families, superfamilies or orders across a large range of species. It also makes possible cross-species searches and comparisons of TE family content between genomes.
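
    The cross-species comparison of TE content mentioned in the conclusions above can be illustrated with a minimal sketch. It assumes a hypothetical, simplified record type (`TEReference`, with `species`, `order`, `superfamily` and `evidence` fields) and invented toy data; it does not reflect the actual RepetDB schema or the InterMine API.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TEReference:
    """Hypothetical, simplified TE reference record (not the RepetDB schema)."""
    name: str
    species: str
    order: str            # e.g. "LTR", "TIR"
    superfamily: str      # e.g. "Gypsy", "Copia"
    evidence: tuple = ()  # features that justify the classification


def te_content_by_species(records):
    """Count TE reference sequences per (order, superfamily) for each species."""
    content = defaultdict(Counter)
    for rec in records:
        content[rec.species][(rec.order, rec.superfamily)] += 1
    return content


def shared_superfamilies(content, species_a, species_b):
    """Return the (order, superfamily) pairs found in both species."""
    return set(content[species_a]) & set(content[species_b])


# Toy records, invented for illustration only.
records = [
    TEReference("TE_1", "species_A", "LTR", "Gypsy", ("LTR termini", "RT domain")),
    TEReference("TE_2", "species_A", "LTR", "Copia", ("LTR termini",)),
    TEReference("TE_3", "species_A", "TIR", "Mariner", ("TIRs", "transposase domain")),
    TEReference("TE_4", "species_B", "LTR", "Gypsy", ("LTR termini",)),
    TEReference("TE_5", "species_B", "LTR", "Gypsy", ("RT domain",)),
]

content = te_content_by_species(records)
common = shared_superfamilies(content, "species_A", "species_B")
print(sorted(common))
```

    Keeping the evidence alongside each record is what lets a curator check why a sequence was assigned to a given order or superfamily before relying on the cross-species counts.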

    REPET

    The REPET package (Flutre T. et al., 2011; Quesneville H. et al., 2005) integrates bioinformatics programs in order to tackle biological issues at the genomic scale. Its two main pipelines are dedicated to detecting, annotating and analyzing repeats in genomic sequences, and are specifically designed for transposable elements (TEs).

    A computational architecture designed for genome annotation: oak genome sequencing project as a use case

    The ANR Genoak project aims to study the two key evolutionary processes that explain the remarkable diversity found within the oak genus. We performed an automated structural annotation (transposable elements (TEs) and genes) and functional annotation of predicted genes using robust pipelines: i/ REPET for TEs, ii/ Eugene for gene prediction, iii/ FunAnnotPipe (in-house pipeline), mainly based on InterProScan, for functional annotation. Further objectives were to: i/ integrate the whole genome with all the annotated features into a genome browser, ii/ provide an interface for gene prediction curation/validation, and iii/ provide an information system pointing towards accessibility and interoperability.

    The IWGSC Reference genome browser, data mining and beyond

    URGI is a genomics and bioinformatics research unit at INRA (French National Institute for Agricultural Research), dedicated to plants and crop parasites. We develop and maintain a genomic and genetic Information System called GnpIS that manages multiple types of wheat data. Under the umbrella of the IWGSC (International Wheat Genome Sequencing Consortium), we have set up a Sequence Repository on the Wheat@URGI website to store, browse and query the data being generated by the consortium: http://wheat-urgi.versailles.inra.fr/Seq-Repository. The repository holds the wheat physical maps and sequences, especially the gold-standard IWGSC reference sequence of all the pseudomolecules. We set up dedicated tools: a genome browser to display the reference sequence and its incoming annotations; a BLAST server to query the sequence and provide links to the browsers; and an InterMine data warehouse to integrate the genomics data with genetics and phenomics data to go beyond.

    A chromosome-based draft sequence of the hexaploid bread wheat (Triticum aestivum) genome

    The complete list of authors and their affiliations is available at the end of the article - 96 contributors: Mayer KF, Rogers J, Doležel J, Pozniak C, Eversole K, Feuillet C, Gill B, Friebe B, Lukaszewski AJ, Sourdille P, Endo TR, Kubaláková M, Cíhalíková J, Dubská Z, Vrána J, Sperková R, Simková H, Febrer M, Clissold L, McLay K, Singh K, Chhuneja P, Singh NK, Khurana J, Akhunov E, Choulet F, Alberti A, Barbe V, Wincker P, Kanamori H, Kobayashi F, Itoh T, Matsumoto T, Sakai H, Tanaka T, Wu J, Ogihara Y, Handa H, Maclachlan PR, Sharpe A, Klassen D, Edwards D, Batley J, Olsen OA, Sandve SR, Lien S, Steuernagel B, Wulff B, Caccamo M, Ayling S, Ramirez-Gonzalez RH, Clavijo BJ, Wright J, Pfeifer M, Spannagl M, Martis MM, Mascher M, Chapman J, Poland JA, Scholz U, Barry K, Waugh R, Rokhsar DS, Muehlbauer GJ, Stein N, Gundlach H, Zytnicki M, Jamilloux V, Quesneville H, Wicker T, Faccioli P, Colaiacovo M, Stanca AM, Budak H, Cattivelli L, Glover N, Pingault L, Paux E, Sharma S, Appels R, Bellgard M, Chapman B, Nussbaumer T, Bader KC, Rimbert H, Wang S, Knox R, Kilian A, Alaux M, Alfama F, Couderc L, Guilhot N, Viseux C, Loaec M, Keller B, Praud S. An ordered draft sequence of the 17-gigabase hexaploid bread wheat (Triticum aestivum) genome has been produced by sequencing isolated chromosome arms. We have annotated 124,201 gene loci distributed nearly evenly across the homeologous chromosomes and subgenomes. Comparative gene analysis of wheat subgenomes and extant diploid and tetraploid wheat relatives showed that high sequence similarity and structural conservation are retained, with limited gene loss, after polyploidization. However, across the genomes there was evidence of dynamic gene gain, loss, and duplication since the divergence of the wheat lineages. A high degree of transcriptional autonomy and no global dominance was found for the subgenomes. These insights into the genome biology of a polyploid crop provide a springboard for faster gene isolation, rapid genetic marker development, and precise breeding to meet the needs of increasing food demand worldwide.

    URGI plant and fungi platform: distributed resources through GMOD tools

    Next-generation sequencing technologies produce very large amounts of data. Genomes are (re-)sequenced at a high pace, and new types of sequence data are produced (e.g., RNA-seq, ChIP-seq). To face this challenge, the URGI (http://urgi.versailles.inra.fr) platform aims to provide tools for genomics, genetics, transcriptomics and polymorphism analysis, comprising pipelines, databases and user-friendly interfaces to analyze, browse and query the data. We will present plant and fungal genomic resources distributed through GMOD tools integrated in our Information System, GnpIS.
    - Our genome module database (DB) components rely on the well-known schemas from the GMOD consortium. All annotation features and analysis results are primarily stored in the Chado or Bio::SeqFeature schema, according to need. Data can then be searched through GnpIS QuickSearch, based on Apache Lucene™. Indexes are generated to query data stored in the same or separate GMOD DBs. Query results are ranked by relevance to the search terms and linked to other GnpIS modules and/or the Genome Report System (GRS). BioMart (GMOD)-based datamarts are used as an advanced search tool. Results of complex searches can be exported in different formats or sent directly to our Galaxy server for further bioinformatic analysis.
    - We provide textual or graphical interfaces over the DBs, such as GBrowse or GBrowse_syn to display sequence annotations or synteny, respectively, and Apollo for gene structure curation. The GRS provides comprehensive categories of reports through a user-friendly textual interface over structural and functional genomic data stored in Chado databases.
    - We also present the pipelines we developed for differential gene expression and polymorphism analysis, available through our Galaxy server.
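
    The keyword search step described above (index the annotation features, then return hits ordered by how well they match the query terms) can be sketched with a toy inverted index. This is a simplified stand-in, not Apache Lucene and not the GnpIS QuickSearch code; the feature identifiers, the descriptions and the term-count score are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_index(features):
    """Map each lowercase term to the features containing it, with term counts."""
    index = defaultdict(Counter)
    for feature_id, description in features.items():
        for term in description.lower().split():
            index[term][feature_id] += 1
    return index


def search(index, query):
    """Rank features by the summed counts of the query terms they contain."""
    scores = Counter()
    for term in query.lower().split():
        for feature_id, count in index.get(term, {}).items():
            scores[feature_id] += count
    # Highest score first; ties are broken by feature identifier.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy annotation features, invented for illustration only.
features = {
    "gene_001": "putative kinase involved in disease resistance",
    "gene_002": "resistance gene analog with kinase domain and kinase activity",
    "te_001": "LTR retrotransposon fragment",
}

index = build_index(features)
hits = search(index, "kinase resistance")
print(hits)
```

    A production search engine would also weight rare terms more heavily and normalize for description length; the raw term count is used here only to keep the ranking idea visible.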