10 research outputs found

    Deciphering the genome structure and paleohistory of _Theobroma cacao_

    We sequenced and assembled the genome of _Theobroma cacao_, an economically important tropical fruit tree crop that is the source of chocolate. The assembly corresponds to 76% of the estimated genome size and contains almost all previously described genes, with 82% of them anchored on the 10 _T. cacao_ chromosomes. Analysis of this sequence information highlighted specific expansion of some gene families during evolution, for example flavonoid-related genes. It also provides a major source of candidate genes for _T. cacao_ disease resistance and quality improvement. Based on the inferred paleohistory of the _T. cacao_ genome, we propose an evolutionary scenario whereby the ten _T. cacao_ chromosomes were shaped from an ancestor through eleven chromosome fusions. The _T. cacao_ genome can be considered as a simple living relic of higher plant evolution.

    Role of bioinformatics as a tool

    © 2012 by Taylor & Francis Group, LLC. Bioinformatics plays an essential role in today’s plant science, mainly due to the exponential growth of genomic sequences generated by high-throughput sequencing technologies. The success of genomics has also fostered the emergence of complementary “omics” research areas and has led to the diversification of data. In this context, various approaches, software and databases have been developed to transform biological data into meaningful information, and some of them are used on a daily basis by scientists. Compared with model plant species, Musa is still at an early stage, but useful tools have been established and are ready for the much larger datasets that are expected in the near future. In this chapter, we introduce the resources and tools available to support Musa research, and bioinformatics topics such as gene annotation, transcriptomics, proteomics, and data integration are addressed.

    Transcriptome data from three endemic Myrtaceae species from New Caledonia displaying contrasting responses to myrtle rust (Austropuccinia psidii)

    The myrtle rust disease, caused by the fungus Austropuccinia psidii, infects a wide range of host species within the Myrtaceae family worldwide. Since its first report in New Caledonia in 2013, it has been found in various types of native environments where Myrtaceae are the dominant or codominant species, as well as in several commercial nurseries. It is now considered a significant threat to ecosystem biodiversity and the Myrtaceae-related economy. The use of predictive molecular markers for resistance against myrtle rust is currently the most cost-effective and ecological approach to controlling the disease. Such an approach was not possible for New Caledonian endemic Myrtaceae species because of the lack of genomic resources. Recent advances in new-generation sequencing technologies, together with relevant bioinformatics tools, now provide new research opportunities for work on non-model organisms at the transcriptomic level. The present study focuses on transcriptome analysis of three Myrtaceae species endemic to New Caledonia (Arillastrum gummiferum, Syzygium longifolium and Tristaniopsis glauca) that display contrasting responses to the pathogen (non-infected vs infected). Differential gene expression (DGE) and variant calling analyses were conducted on each species. We used a dual approach based on 1) the annotated reference genome of a related Myrtaceae species (Eucalyptus grandis) and 2) de novo transcriptomes of each species.

    High homologous gene conservation despite extreme autopolyploid redundancy in sugarcane

    Modern sugarcane (Saccharum spp.) is the leading sugar crop and a primary energy crop. It has the highest level of 'vertical' redundancy (2n = 12x = 120) of all polyploid plants studied to date. It was produced about a century ago through hybridization between two autopolyploid species, namely S. officinarum and S. spontaneum. In order to investigate the genome dynamics in this highly polyploid context, we sequenced and compared seven hom(oe)ologous haplotypes (bacterial artificial chromosome clones). Our analysis revealed a high level of gene retention and colinearity, as well as high gene structure and sequence conservation, with an average sequence divergence of 4% for exons. Remarkably, all of the hom(oe)ologous genes were predicted as being functional (except for one gene fragment) and showed signs of evolving under purifying selection, with the exception of genes within segmental duplications. By contrast, transposable elements displayed a general absence of colinearity among hom(oe)ologous haplotypes and appeared to have undergone dynamic expansion in Saccharum, compared with sorghum, its close relative in the Andropogoneae tribe. These results reinforce the general trend emerging from recent studies indicating the diverse and nuanced effect of polyploidy on genome dynamics.

    Ten steps to get started in Genome Assembly and Annotation

    As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic properties of genomes are discussed, as is the importance of using high quality DNA. Different sequencing technologies and generally applicable workflows for genome assembly are also detailed. We cover structural and functional annotation and encourage readers to also annotate transposable elements, something that is often omitted from annotation workflows. The importance of data management is stressed, and we give advice on where to submit data and how to make your results Findable, Accessible, Interoperable, and Reusable (FAIR). ELIXIR-EXCELERATE is funded by the European Commission within the Research Infrastructures Programme of Horizon 2020 [676559].

    The genome sequence of the entomopathogenic bacterium Photorhabdus luminescens

    Photorhabdus luminescens is a symbiont of nematodes and a broad-spectrum insect pathogen. The complete genome sequence of strain TT01 is 5,688,987 base pairs (bp) long and contains 4,839 predicted protein-coding genes. Strikingly, it encodes a large number of adhesins, toxins, hemolysins, proteases and lipases, and contains a wide array of antibiotic synthesizing genes. These proteins are likely to play a role in the elimination of competitors, host colonization, invasion and bioconversion of the insect cadaver, making P. luminescens a promising model for the study of symbiosis and host-pathogen interactions. Comparison with the genomes of related bacteria reveals the acquisition of virulence factors by extensive horizontal transfer and provides clues about the evolution of an insect pathogen. Moreover, newly identified insecticidal proteins may be effective alternatives for the control of insect pests.

    The coffee genome provides insight into the convergent evolution of caffeine biosynthesis

    Coffee is a valuable beverage crop due to its characteristic flavor, aroma, and the stimulating effects of caffeine. We generated a high-quality draft genome of the species Coffea canephora, which displays a conserved chromosomal gene order among asterid angiosperms. Although it shows no sign of the whole-genome triplication identified in Solanaceae species such as tomato, the genome includes several species-specific gene family expansions, among them N-methyltransferases (NMTs) involved in caffeine production, defense-related genes, and alkaloid and flavonoid enzymes involved in secondary compound synthesis. Comparative analyses of caffeine NMTs demonstrate that these genes expanded through sequential tandem duplications independently of genes from cacao and tea, suggesting that caffeine in eudicots is of polyphyletic origin.

    South Green bioinformatics platform: Update 2015

    In 2015, the South Green Bioinformatics Platform (http://www.southgreen.fr/) is a network of 35 bioinformaticians from five biology research institutes working with two High-Performance Computing Data Centres to develop and use new tools for NGS/Omic analytics of tropical and Mediterranean crops, under projects studying the relationship between genetic diversity, agronomic performance and response to selection. South Green is affiliated to the South regional centre of the French Institute of Bioinformatics (the French node of the European research infrastructure, ELIXIR). This community and the HPC data centres are all located in Montpellier, which facilitates close collaboration and significant pooling to best meet the biologists' demands of our research units. Since 2004, we have developed web-based applications with both generic and in-house components, for databases, analysis workflows and web interfaces, in order to: manage genetic and phenotypic information (e.g. TropGeneDB), analyse molecular markers and genetic diversity (e.g. SNiPlay), assemble transcriptomes (e.g. ESTtik), map RNA-Seq (e.g. ARCAD), annotate and compare genomes (e.g. GNPAnnot), and reconstruct the evolutionary history of gene families by phylogenomics (e.g. GreenPhyl). We also participate in the analysis of numerous crop species, which requires computing and storage facilities as well as interoperable information systems, such as rice (e.g. OryGenesDB), wheat, sorghum, sugarcane, banana (Banana Genome Hub), palms, yam, coffee (CGH), rubber, cacao (CocoaGenDB), cotton, apple, grapevine, olive, eucalyptus and cassava. To face the data deluge, we must increase our analytics capabilities. We document our operation at both the administrator/developer and user/scientist levels, to provide high quality services and reproducible research. We pool into working groups on key themes such as GBS, at both the developer (extreme pair programming) and user (interdisciplinary knowledge exchange) levels. We provide training sessions each year. Finally, we implemented several instances of the Galaxy workflow manager and encapsulated our tools. These instances serve as a catalyst for massive NGS analyses, but storage capacity still needs to be increased and data management plans improved. (Author's abstract)

    The sugarcane genome challenges: Strategies for sequencing a highly complex genome

    Sugarcane cultivars derive from interspecific hybrids obtained by crossing Saccharum officinarum and Saccharum spontaneum and provide feedstock used worldwide for sugar and biofuel production. The importance of sugarcane as a bioenergy feedstock has increased interest in the generation of new cultivars optimised for energy production. Cultivar improvement has relied largely on traditional breeding methods, which may be limited by the complexity of inheritance in interspecific polyploid hybrids, and the time-consuming process of selection of plants with desired agronomic traits. In this sense, molecular genetics can assist in the process of developing improved cultivars by generating molecular markers that can be used in the breeding process or by introducing new genes into the sugarcane genome. To meet each of these goals, and additional ones, biotechnologists would benefit from a reference genome sequence of a sugarcane cultivar. The sugarcane genome poses challenges that have not been addressed in any prior sequencing project, due to its highly polyploid and aneuploid genome structure with a complete set of homeologous genes predicted to range from 10 to 12 copies (alleles) and to include representatives from each of two different species. Although sugarcane's monoploid genome is about 1 Gb, its highly polymorphic nature represents another significant challenge for obtaining a genuine assembled monoploid genome. With a rich resource of expressed-sequence tag (EST) data in the public domain, the present article describes tools and strategies that may aid in the generation of a reference genome sequence.

    The genome of Theobroma cacao.

    We sequenced and assembled the draft genome of Theobroma cacao, an economically important tropical-fruit tree crop that is the source of chocolate. This assembly corresponds to 76% of the estimated genome size and contains almost all previously described genes, with 82% of these genes anchored on the 10 T. cacao chromosomes. Analysis of this sequence information highlighted specific expansion of some gene families during evolution, for example, flavonoid-related genes. It also provides a major source of candidate genes for T. cacao improvement. Based on the inferred paleohistory of the T. cacao genome, we propose an evolutionary scenario whereby the ten T. cacao chromosomes were shaped from an ancestor through eleven chromosome fusions.