
18 research outputs found

Phylogenomics of plant genomes: a methodology for genome-wide searches for orthologs in plants

Author: Conte Matthieu G
Droc Gaetan
Gaillard Sylvain
Perin Christophe
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: Gene ortholog identification is now a major objective for mining the increasing amount of sequence data generated by complete or partial genome sequencing projects. Comparative and functional genomics urgently need a method for ortholog detection to reduce gene function inference and to aid in the identification of conserved or divergent genetic pathways between several species. As gene functions change during evolution, reconstructing the evolutionary history of genes should be a more accurate way to differentiate orthologs from paralogs. Phylogenomics takes into account phylogenetic information from high-throughput genome annotation and is the most straightforward way to infer orthologs. However, procedures for automatic detection of orthologs are still scarce and suffer from several limitations. Results: We developed a procedure for ortholog prediction between _Oryza sativa_ and _Arabidopsis thaliana_. Firstly, we established an efficient method to cluster _A. thaliana_ and _O. sativa_ full proteomes into gene families. Then, we developed an optimized phylogenomics pipeline for ortholog inference. We validated the full procedure using test sets of orthologs and paralogs to demonstrate that our method outperforms pairwise methods for ortholog predictions. Conclusion: Our procedure achieved a high level of accuracy in predicting ortholog and paralog relationships. Phylogenomic predictions for all validated gene families in both species were easily achieved and we can conclude that our methodology outperforms similarly based methods.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

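Application of the GenFam system to plant stress response: integrating the identification of specific cis elements [original title: Application du système GenFam à la réponse au stress des plantes : intégration de l'identification d'éléments cis spécifiques]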

Author: Bocs Stéphanie
Droc Gaetan
Dufayard Jean-François
Larivière Delphine
Lorenzo Jonathan
This Dominique
Publication venue: Université d'Auvergne
Publication date: 01/01/2015

UMR AGAP, ID team (Data Integration). GenFam is an integrative system for gene family analysis. The system makes it possible (i) to create gene families from complete genomes, (ii) to run a phylogenetic analysis of a family through the Galaxy workflow manager in order to define homology relationships, (iii) to study evolutionary events using synteny blocks precomputed with the SynMap workflow of the comparative genomics platform CoGe, and (iv) to integrate these results into a synthetic visualization interface. The first application of GenFam is to identify candidate genes for tolerance to environmental stresses. This requires highlighting the presence of cis-regulatory sequences specific to the stress response (such as ABRE and DRE). In this context, we need to integrate new tools to discover and search for transcription factor binding sites (TFBS) in the promoter sequences of the genes belonging to the family under study. This Galaxy workflow first selects the 5' or 3' flanking regions of the genes of interest, according to the user's choice. The flanking regions are then analyzed to discover and search for cis-regulatory sequence motifs specific to the stress response, using complementary methods such as MEME, STIF, and PHYME. These results, together with the functional annotation of genes tagged as involved in the stress response, will be integrated into the visualization interface. This work should support reflection on the notion of functional orthology and enable translational research from model species to species of agronomic interest (for example, identifying candidate genes for the stress response of coffee from functional information known in Arabidopsis).

Evolutionary Dynamics of the Leucine-Rich Repeat Receptor-Like Kinase (LRR-RLK) Subfamily in Angiosperms

Author: Anne Diévart
Gaetan Droc
Iris Fischer
Jean-François Dufayard
Nathalie Chantret
Publication venue: American Society of Plant Biologists (ASPB)
Publication date

Crossref

Deciphering the genome structure and paleohistory of _Theobroma cacao_

Author: Ang&#xe9
Ange Marie Risterucci
Anne Dievart
Auré
Bertrand Pitollat
Christopher Viot
Claire Lanaud
Cristian Chaparro
Dave Kudrna
Didier Clement
Diogenes Infante
Dominique Brunel
Emmanuel Guiderdoni
Erika Sallet
Florent Murat
Francis Quetier
Francois Sabot
Gaetan Droc
Ismael Kebe
Jean Marc Aury
Jerome Gouzy
Jerome Salse
Jetty Siva S. Ammiraju
John E. Carlson
Jose Fernandes Barbosa-Neto
Joseph Moroh Akaza
Julie Poulain
Karina Gramacho
Laura Gelley
Maguy Rodier-Goud
Manuel Ruiz
Mark Guiltinan
Mathias Tahi
Mathilde Allegre
Melissa Kramer
Michael Abrouk
Michael Axtell
Michel Boccara
Mickael Bourge
Olivier Fouet
Olivier Panaud
Patrick Wincker
Pierre Costet
Rod Wing
Ronan Rivalan
Schiex T.
Siela Maximova
Spencer Brown
Stephan C. Schuster
Stephanie Sidibe-Bocs
Thierry Legavre
Valentin Guignon
W. Richard McCombie
Wolfgang Golser
Xavier Argout
Xavier Sabau
Xiang Song
Yolande Roguet
Yufan Zhang
Zhaorong Ma
Zi Sh
Publication venue
Publication date: 15/09/2010

We sequenced and assembled the genome of _Theobroma cacao_, an economically important tropical fruit tree crop that is the source of chocolate. The assembly corresponds to 76% of the estimated genome size and contains almost all previously described genes, with 82% of them anchored on the 10 _T. cacao_ chromosomes. Analysis of this sequence information highlighted specific expansion of some gene families during evolution, for example flavonoid-related genes. It also provides a major source of candidate genes for _T. cacao_ disease resistance and quality improvement. Based on the inferred paleohistory of the _T. cacao_ genome, we propose an evolutionary scenario whereby the ten _T. cacao_ chromosomes were shaped from an ancestor through eleven chromosome fusions. The _T. cacao_ genome can be considered as a simple living relic of higher plant evolution.

Crossref

Nature Precedings

Genome assembly of Musa beccarii shows extensive chromosomal rearrangements and genome expansion during evolution of Musaceae genomes

Author: Droc Gaetan
Ge Xue-Jun
Rouard Mathieu
Wang Zheng-Feng
Publication venue: Oxford Univ Press
Publication date: 28/12/2022

Background: Musa beccarii (Musaceae) is a banana species native to Borneo, sometimes grown as an ornamental plant. The basic chromosome number of Musa species is x = 7, 10, or 11; however, M. beccarii has a basic chromosome number of x = 9 (2n = 2x = 18), which is the same basic chromosome number of species in the sister genera Ensete and Musella. Musa beccarii is in the section Callimusa, which is sister to the section Musa. We generated a high-quality chromosome-scale genome assembly of M. beccarii to better understand the evolution and diversity of genomes within the family Musaceae. Findings: The M. beccarii genome was assembled by long-read and Hi-C sequencing, and genes were annotated using both long Isoseq and short RNA-seq reads. The size of M. beccarii was the largest among all known Musaceae assemblies (∼570 Mbp) due to the expansion of transposable elements and increased 45S ribosomal DNA sites. By synteny analysis, we detected extensive genome-wide chromosome fusions and fissions between M. beccarii and the other Musa and Ensete species, far beyond those expected from differences in chromosome number. Within Musaceae, M. beccarii showed a reduced number of terpenoid synthase genes, which are related to chemical defense, and enrichment in lipid metabolism genes linked to the physical defense of the cell wall. Furthermore, type III polyketide synthase was the most abundant biosynthetic gene cluster (BGC) in M. beccarii. BGCs were not conserved in Musaceae genomes. Conclusions: The genome assembly of M. beccarii is the first chromosome-scale genome assembly in the Callimusa section in Musa, which provides an important genetic resource that aids our understanding of the evolution of Musaceae genomes and enhances our knowledge of the pangenome

HAL-CIRAD

RedOak: a reference-free and alignment-free structure for indexing a collection of similar genomes

Author: Agret Clement
Chateau Annie
Droc Gaetan
Mancheron Alban
Ruiz Manuel
Sarah Gautier
Publication venue: Cold Spring Harbor Laboratory
Publication date: 21/01/2021

Background: As the cost of DNA sequencing decreases, high-throughput sequencing technologies become increasingly accessible to many laboratories. Consequently, new issues emerge that require new algorithms, including tools for indexing and compressing hundreds to thousands of complete genomes. Results: This paper presents RedOak, a reference-free and alignment-free software package that allows for the indexing of a large collection of similar genomes. RedOak can also be applied to reads from unassembled genomes, and it provides a nucleotide sequence query function. This software is based on a k-mer approach and has been developed to be heavily parallelized and distributed on several nodes of a cluster. The source code of our RedOak algorithm is available at https://gitlab.info-ufr.univ-montp2.fr/DoccY/RedOak. Conclusions: RedOak may be really useful for biologists and bioinformaticians expecting to extract information from large sequence datasets.
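To make the k-mer idea in this abstract concrete, here is a minimal Python sketch of a k-mer presence index over a small collection of sequences, with a simple query function. It is an illustration only and does not reproduce RedOak's actual data structure, compression, or parallelization. The function names (`kmers`, `build_index`, `query`), the toy sequences, and the choice of k = 4 are all assumptions made for this example.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq, uppercased."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(genomes, k):
    """Map each k-mer to the set of genome names that contain it.

    `genomes` is a dict of {name: sequence}; this is a hypothetical
    in-memory layout, not RedOak's on-disk or distributed format.
    """
    index = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def query(index, sequence, k):
    """Return, per genome, the fraction of the query's k-mers it contains."""
    query_kmers = list(kmers(sequence, k))
    if not query_kmers:
        return {}
    hits = defaultdict(int)
    for kmer in query_kmers:
        # .get avoids creating empty entries in the defaultdict index
        for name in index.get(kmer, ()):
            hits[name] += 1
    return {name: count / len(query_kmers) for name, count in hits.items()}


# Toy collection of two similar sequences (made-up data).
genomes = {
    "genome_a": "ACGTACGTGA",
    "genome_b": "ACGTTCGTGA",
}
index = build_index(genomes, k=4)
scores = query(index, "ACGTAC", k=4)
```

In this sketch, a query is scored by how many of its k-mers appear in each indexed sequence, so no reference genome or alignment step is needed. A production tool would replace the Python sets with a far more compact representation.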

INRIA a CCSD electronic archive server

HAL-CIRAD

The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies

Author: Argout Xavier
Aury Jean-Marc
Droc Gaetan
Fouet Olivier
Labadie Karine
Lanaud Claire
Martin Guillaume
Rivals Eric
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Background: Theobroma cacao L., native to the Amazonian basin of South America, is an economically important fruit tree crop for tropical countries as a source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although it is a useful resource, some improvements are possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring un-anchored sequences to the 10 chromosomes. Methods: We used an NGS-based approach to significantly improve the assembly of the Belizean Criollo B97-61/B2 genome. We combined four Illumina large-insert-size mate-paired libraries with 52x of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds. We then used genotyping-by-sequencing (GBS) methods to increase the proportion of the assembly anchored to chromosomes.

HAL Evry

Crossref

INRIA a CCSD electronic archive server

Directory of Open Access Journals

Floral transition A genetic regulatory network controlling floral transition in apple

Author: Andrés Fernando
Bolbol Mohamad Al
Costes Evelyne
Droc Gaetan
Estevan Joan
Fernández Virginia
Jeong Kwanho
Khoury Samer El
Sarah Gautier
Soriano Alexandre
Publication venue: HAL CCSD
Publication date: 16/06/2024


HAL-CIRAD

AgroLD: A Knowledge Graph Database for plant functional genomics

Author: Chentli Imene
Droc Gaetan
El Hassouni Nordine
Guignon Valentin
Jonquet Clement
Larmande Pierre
Pitollat Bertrand
Rouard Mathieu
Ruiz Manuel
Tagny Gildas
Tando Ndomassi
Venkatesan Aravind
Publication venue: Cold Spring Harbor Laboratory
Publication date: 06/07/2021

The Explore Relationships tool aids in exploring relationships between existing entities. Quick search is based on keyword search and aids in understanding the underlying knowledge.

HAL Descartes

HAL-IRD

HAL-CIRAD

A chromosome-level reference genome of Ensete glaucum gives insight into diversity and chromosomal and repetitive sequence evolution in the Musaceae

Author: Baurens Franc-Christophe
Biswas Manosh
Cui Dongli
Droc Gaetan
Ge Xue-Jun
Heslop-Harrison Pat (J S)
Liu Qing
Rouard Mathieu
Roux Nicolas
Schwarzacher Trude
Wang Ziwei
Publication venue: Oxford University Press (OUP)
Publication date: 30/04/2022

Background: Ensete glaucum (2n = 2x = 18) is a giant herbaceous monocotyledonous plant in the small Musaceae family along with banana (Musa). A high-quality reference genome sequence assembly of E. glaucum is a resource for functional and evolutionary studies of Ensete, Musaceae, and the Zingiberales. Findings: Using Oxford Nanopore Technologies, chromosome conformation capture (Hi-C), Illumina and RNA survey sequence, supported by molecular cytogenetics, we report a high-quality 481.5 Mb genome assembly with 9 pseudo-chromosomes and 36,836 genes. A total of 55% of the genome is composed of repetitive sequences with predominantly LTR-retroelements (37%) and DNA transposons (7%). The single 5S ribosomal DNA locus had an exceptionally long monomer length of 1,056 bp, more than twice that of the monomers at multiple loci in Musa. A tandemly repeated satellite (1.1% of the genome, with no similar sequence in Musa) was present around all centromeres, together with a few copies of a long interspersed nuclear element (LINE) retroelement. The assembly enabled us to characterize in detail the chromosomal rearrangements occurring between E. glaucum and the x = 11 species of Musa. One E. glaucum chromosome has the same gene content as Musa acuminata, while others show multiple, complex, but clearly defined evolutionary rearrangements in the change between x = 9 and 11. Conclusions: The advance towards a Musaceae pangenome including E. glaucum, tolerant of extreme environments, makes a complete set of gene alleles, copy number variation, and a reference for structural variation available for crop breeding and understanding environmental responses. The chromosome-scale genome assembly shows the nature of chromosomal fusion and translocation events during speciation, and features of rapid repetitive DNA change in terms of copy number, sequence, and genomic location, critical to understanding its role in diversity and evolution.