138 research outputs found

    The cocoa genome hub, an integrated platform to access the Criollo genome V2

    The first draft genome of the species, from the Belizean Criollo B97-61/B2 cultivar, was published in 2011. Although it was a useful resource, improvements were possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring unanchored sequences to the 10 chromosomes. In 2017, we used a combination of Next Generation Sequencing data to produce version 2 of the assembly. We corrected misassembled regions and reduced the number of scaffolds from 4,792 in assembly V1 to 554 in V2, with an N50 that increased from 0.47 Mb in V1 to 6.5 Mb in V2. A total of 96.7% of the assembly was anchored to the 10 chromosomes, compared to 66.8% in the previous version. Unknown sites (Ns) were reduced from 10.8% to 5.7%. In addition, we updated the functional annotations and performed a new RefSeq structural annotation based on RNAseq evidence. In that context, and to support post-genomics efforts, we developed the Cocoa Genome Hub (http://cocoagenome-hub.southgreen.fr/), an integrated web-based database providing centralized access to the T. cacao genome and to analysis tools that facilitate basic, translational and applied research in cocoa. We provide access to the complete Criollo genome sequence V2 along with gene structure, gene product information, metabolism, gene families, transcriptomics (ESTs, RNA-Seq), genetic markers and genetic maps. The hub relies on generic software (e.g. GMOD tools) for easy querying, visualizing and downloading of research data. It includes a Genome Browser enhanced by a Community Annotation System, enabling the improvement of automatic gene annotation through an annotation editor. (Author's abstract)
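The N50 figures quoted in the abstract above (0.47 Mb in V1, 6.5 Mb in V2) summarize assembly contiguity. As a minimal sketch, assuming the common definition of N50 (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly), the statistic can be computed as follows. The function name `n50` and the toy scaffold lengths are hypothetical and are not taken from the Criollo assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (common definition)."""
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy scaffold lengths (in bp), for illustration only.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))
```

Fewer, longer scaffolds raise this value, which is consistent with the reported drop from 4,792 to 554 scaffolds accompanying the higher N50.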

    GNPAnnot community annotation system applied to sugarcane BAC clone sequences (W572)

    A large amount of data is being produced by current genome sequencing projects. Sequence annotations and analyses need to be organized into databases and made widely accessible. Like other species, sugarcane would benefit from centralized and innovative systems to study its genome. The GNPAnnot community annotation system (CAS) could be particularly relevant to the SUGESI sequencing project. It consists of a system for structural and functional annotations supported by comparative genomics, allowing both automatic predictions and manual curation of genes and transposable elements. The core of the GNPAnnot CAS dedicated to tropical plants is made of GMOD components. The Chado database can be browsed using the Generic Genome Browser (GBrowse), which provides links to genome editors (i.e. Artemis and Apollo). We developed the Chado controller in order to manage public and private annotation projects. It also provides an annotation history page for each gene or transposable element and an annotation inspector that automates several tasks and reports annotation mistakes. GNPAnnot CAS has already been used to annotate sugarcane BAC clone sequences and could be useful to facilitate the annotation of novel sugarcane sequences. (Author's abstract)

    Evolutionary dynamics of hom(oe)ologous chromosome segments within the highly polyploid sugarcane genome

    Modern sugarcane (Saccharum spp.) is the leading sugar crop and a primary energy crop. It presents one of the most complex crop genomes studied to date, mainly due to the very high level of vertical redundancy (2n = ca 12x = ca 120 = 10 Gb), together with an interspecific origin. Modern cultivars are derived from hybridization, performed by breeders a century ago, between two autopolyploid species, namely S. officinarum (domesticated, 2n=8x=80) and S. spontaneum (wild species, 2n=5x=40 to 16x=128). To investigate genome dynamics in this highly polyploid context, we sequenced and analyzed the structural organization of hom(oe)ologous chromosome segments (bacterial artificial chromosome clones) from a few regions of the sugarcane cultivar R570. For all regions, almost perfect gene colinearity and high gene structure and sequence conservation were observed. Moreover, the vast majority of the homoeologous genes were predicted, based on their structure, to be functional and showed signs of evolving under purifying selection. Compared to sorghum, the sugarcane haplotypes displayed a high gene colinearity. By contrast, transposable elements displayed a general absence of colinearity among hom(oe)ologous haplotypes. Our data suggest the presence of broad sets of functional homologous alleles in the sugarcane genome, which could explain its unique efficiency, particularly its high phenotypic plasticity and wide adaptation. (Author's abstract)

    Comparison of hom(oe)ologous chromosome segments in the highly polyploid interspecific genome of sugarcane

    Modern sugarcane cultivars (Saccharum spp.) present one of the most complex crop genomes studied to date, mainly due to a very high degree of polyploidy (2n = 12x = 120), and an interspecific origin from two autopolyploid species, namely S. officinarum and S. spontaneum. To investigate the impact of polyploidization on the sugarcane genome organization and more widely on its performance and plasticity, we finely analyzed the structural organization of hom(oe)ologous chromosome segments. Twenty-seven homoeologous BAC clones from three distinct regions, carrying the genes Adh1 (13 hom(oe)ologous chromosome segments), PST2a (10 hom(oe)ologous chromosome segments) and CAD2 (4 hom(oe)ologous chromosome segments), were identified, sequenced, finely annotated and compared, representing more than 2.5 Mb of sugarcane DNA sequence. Very high gene colinearity, gene structure conservation and sequence conservation (98.1% average nucleotide sequence identity for the coding sequence, and 93.3% for the aligned part of the introns) were observed among all hom(oe)ologous chromosome segments, confirming preliminary observations. Based on their structure, the homoeologous genes were predicted to be functional, and the vast majority of them showed signs of evolving under purifying selection. Colinearity between hom(oe)ologous chromosomes also extended to many intergenic regions and transposable elements. Divergence between hom(oe)ologous genes and patterns of transposable element insertions are currently being analyzed in order to infer the origin (S. officinarum vs S. spontaneum) of the chromosome segments. The high level of gene colinearity and structure conservation has implications for the whole-genome sequencing strategy for this complex genome, since it suggests that one chromosome segment could serve as a reference for the other hom(oe)ologous chromosome segments regarding gene content. The maintenance of a broad set of functional alleles, which we describe, may be involved in the high phenotypic plasticity and wide adaptation of sugarcane. (Author's abstract)
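The identity figures in the abstract above (98.1% for coding sequence, 93.3% for the aligned part of the introns) are pairwise nucleotide identities. The abstract does not say how they were computed, so the following is only a minimal sketch under one common convention: identity is the share of matching bases among alignment columns where neither sequence has a gap. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        # Skip columns where either sequence has a gap, so only the
        # aligned part of the two sequences is scored.
        if base_a == gap or base_b == gap:
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: four gap-free columns, three of them matching.
print(percent_identity("AC-GT", "ACTGA"))
```

Other conventions (for example, counting gap columns in the denominator) give lower values for the same alignment, so any comparison across studies should check which definition was used.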

    Detailed analyses of 12 hom(oe)ologous chromosome segments in the highly polyploid sugarcane genome

    Modern sugarcane cultivars (Saccharum spp.) are recognized as the crop with the most complex genome studied to date, mainly due to the very high level of vertical redundancy (2n = ca 12x = ca 120), together with an interspecific origin. They are derived from hybridization, performed by breeders a century ago, between two autopolyploid species, namely S. officinarum (domesticated) and S. spontaneum (wild species, 2n=5x=40 to 16x=128). To investigate the impact of polyploidization on the sugarcane genome organization and more widely on its performance and plasticity, we finely analyzed the structural organization of hom(oe)ologous chromosomes. Thirty-three homoeologous BAC clones from four regions of the sugarcane R570 genome were identified, sequenced, finely annotated and compared, representing more than 3 Mb of sugarcane DNA sequence. For all four regions, almost perfect gene colinearity and high gene structure and sequence conservation were observed, confirming previous preliminary analyses on two of these regions. Moreover, the vast majority of the homoeologous genes were predicted, based on their structure, to be functional and showed signs of evolving under purifying selection. For one of the regions, carrying the Adh1 gene, we extended the homoeologous series to 13 hom(oe)ologous chromosome segments. Gene similarity and patterns of transposable element insertions are currently being analyzed in order to determine the origin (S. officinarum vs S. spontaneum) and the evolutionary dynamics of these hom(oe)ologous regions. (Author's abstract)

    Orylink: A personalized integrated system for functional genomic analysis : [Abstract CP842]

    Plant functional genomics requires data integration from several sources. A classical example is the need for cross-references between gene location and the corresponding mutant lines, a feature already present in reverse genetic databases like OryGenesDB or T-DNA express. We recently developed three plant databases specifically designed for rice functional genomics: OryGenesDB, OryzaTagLine, and GreenPhylDB. OryGenesDB is a reverse genetics and genomic database and works together with OryzaTagLine, which contains the corresponding phenotypic description of the mutant lines. GreenPhylDB is designed for comparative functional genomics and links the two model plant species Oryza sativa and Arabidopsis thaliana through ortholog predictions. We developed Orylink to run web queries on remote databases. Using Orylink, biologists can speed up information retrieval across these three databases, including FST, mutant phenotypes and Arabidopsis orthologs. The interface supports user logins and profiles. Any user can personalize the system using specific forms to display relevant information synthesized from many data sources. Furthermore, we developed and registered some Web services on the BioMOBY registry that can be used, independently of Orylink, to retrieve genomic location, gene information, germplasm name, phenotype description, and information on Arabidopsis thaliana and rice gene orthologs. The application is available with many other tools at http://orygenesdb.cirad.fr. (Full text)

    High homologous gene conservation despite extreme autopolyploid redundancy in sugarcane

    Modern sugarcane (Saccharum spp.) has been recognized as one of the world's most efficient crops in solar energy conversion and as having the most favorable input:output ratios. Besides its importance for sugar production, it has thus recently become a primary energy crop. Sugarcane also presents one of the most complex crop genomes studied to date, mainly due to a very high degree of polyploidy (2n = ca 12x = ca 120), together with an interspecific origin. In order to investigate genome dynamics in this highly polyploid context and to provide guidelines for future whole genome sequencing projects, we sequenced and compared seven homoeologous haplotypes (BAC clones). Our analysis revealed a high conservation at the gene level (high colinearity and high gene structure and sequence conservation). Remarkably, all homoeo-alleles are predicted to be functional, and no apparent general decrease of purifying selection was observed. Thus the high polyploidy of sugarcane does not seem to have induced a major reshaping of its genome, at least at the gene level. By contrast, transposable elements displayed a general absence of colinearity among homoeologous haplotypes and appeared to have undergone dynamic expansion in Saccharum, compared with sorghum, its close relative in the Andropogoneae tribe. Our data suggest the presence of broad sets of functional homologous alleles in the sugarcane genome, which could explain its unique efficiency, its high phenotypic plasticity and wide adaptation. (Full text)

    Exploring predicted Musa genes using the GreenPhyl comparative genomics database : W078

    With the increasing number of plant genomes being sequenced, a major challenge is to accurately transfer annotation from well-characterized genomes to newly obtained sequences. GreenPhylDB is a database designed for comparative and functional genomics based on complete genome-derived gene sequences (Conte et al., 2008; Rouard M, Guignon V et al., 2011). The database currently includes gene sequences from 22 plant species, including Musa (representative of bananas and plantains). Genes from all these species are organized in clusters based on sequence similarity. The clusters (or families) are manually annotated (i.e. properly named and classified), and the sequences included in each cluster are characterized by phylogenetic analysis in order to elucidate evolutionary relationships (e.g. orthologs, super-orthologs, in/out-paralogs) among genes. GreenPhyl provides a reliable (Martinez, 2011) and stable catalog of gene families useful for the annotation of new genome sequences in plants. GreenPhyl has been particularly useful for studying the transcription factors of the recently published Musa acuminata (Doubled Haploid Pahang) genome sequence (D'Hont et al., 2012). With its improved user interface, the new release of GreenPhyl (available at http://www.greenphyl.org) keeps the previous gene clustering quality and introduces additional features such as specific search engines (quick search, deep search, InterPro domain combination and GO family browser). This talk will present the latest developments of GreenPhyl version 3 and will give a few examples of gene family analyses in Musa. (Author's abstract)

    GNPAnnot: a community annotation system applied to sugarcane sequences : W745

    A large amount of data is being produced by current genome sequencing projects. Sequence annotations need to be organized into databases and made widely accessible. Like other species, sugarcane would benefit from centralized and innovative systems to study its genome. GNPAnnot is a community system performing structural and functional annotations of genes and allowing both automatic predictions and manual curation of genes and transposable elements (TEs). The system is currently being used for various plant, insect and fungus species. The GNPAnnot pipeline is made of a collection of programs that are connected together to automate genomic sequence annotation. Sequences and results are stored in the Chado GMOD database and can be visualized through a genome browser accessible from the Web portal of the SouthGreen bioinformatics platform (http://southgreen.cirad.fr/). Annotations can be manually edited using the Artemis genome editor. A database controller (Chado controller) has been developed in order to manage public and private annotation projects. It also provides an annotation history page for each gene or TE, and an annotation inspector that reports manual annotation mistakes. The GNPAnnot system is currently being used to annotate sugarcane BAC sequences in the framework of SUGESI (Sugarcane Genome Sequencing Initiative), which aims at sequencing around 5,000 BACs from cultivar R570, corresponding to the gene-rich part of a monoploid genome of sugarcane. The GNPAnnot system has been developed by partners from CIRAD, INRA and Bioversity and has been supported by the French National Research Agency and the Genoplante joint program. (Author's abstract)