43 research outputs found

TC-motifs at the TATA-box expected position in plant genes: a novel class of motifs involved in the transcription regulation

Authors: Bernard Virginie; Brunaud Véronique; Lecharny Alain
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The TATA-box and TATA-variants are regulatory elements involved in the formation of a transcription initiation complex. Both have been conserved throughout evolution in a restricted region close to the Transcription Start Site (TSS). However, less than half of the genes in model organisms studied so far have been found to contain either one of these elements. Indeed different core-promoter elements are involved in the recruitment of the TATA-box-binding protein. Here we assessed the possibility of identifying novel functional motifs in plant genes, sharing the TATA-box topological constraints. Results: We developed an ab initio approach considering the preferential location of motifs relative to the TSS. We identified motifs observed at the TATA-box expected location and conserved in both Arabidopsis thaliana and Oryza sativa promoters. We identified TC-elements within non-TA-rich promoters 30 bases upstream of the TSS. As with the TATA-box and TATA-variant sequences, it was possible to construct a unique distance graph with the TC-element sequences. The structural and functional features of TC-element-containing genes were distinct from those of TATA-box- or TATA-variant-containing genes. Arabidopsis thaliana transcriptome analysis revealed that TATA-box-containing genes were generally those showing relatively high levels of expression and that TC-element-containing genes were generally those expressed in specific conditions. Conclusions: Our observations suggest that the TC-elements might constitute a class of novel regulatory elements participating in the complex modulation of gene expression in plants.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Unique genes in plants: specificities and conserved features throughout evolution

Authors: Armisén David; Aubourg Sébastien; Lecharny Alain
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Plant genomes contain a high proportion of duplicated genes as a result of numerous whole, segmental and local duplications. These duplications lead to the formation of gene families, which are the usual material for many evolutionary studies. However, all characterized genomes include single-copy (unique) genes that have not received much attention. Unlike gene duplication, gene loss is not an unspecific mechanism but is rather influenced by a functional selection. In this context, we have established and used stringent criteria in order to identify suitable sets of unique genes present in plant proteomes. Comparisons of unique genes in the green phylum were used to characterize the gene and protein features exhibited by both conserved and species-specific unique genes. Results: We identified the unique genes within both A. thaliana and O. sativa genomes and classified them according to the number of homologs in the alternative species: none (U{1:0}), one (U{1:1}) or several (U{1:m}). Regardless of the species, all the genes in these groups present some conserved characteristics, such as small average protein size and abnormal intron number. In order to understand the origin and function of unique genes, we further characterized the U{1:1} gene pairs. The possible involvement of sequence convergence in the creation of U{1:1} pairs was discarded due to the frequent conservation of intron positions. Furthermore, an orthology relationship between the two members of each U{1:1} pair was strongly supported by a high conservation in the protein sizes and transcription levels. Within the promoter of the unique conserved genes, we found a number of TATA and TELO boxes that specifically differed from their mean number in the whole genome. Many unique genes have been conserved as unique through evolution from the green alga Ostreococcus lucimarinus to higher plants. Plant unique genes may also have homologs in bacteria and we showed a link between the targeting towards plastids of proteins encoded by plant nuclear unique genes and their homology with a bacterial protein. Conclusion: Many of the A. thaliana and O. sativa unique genes are conserved in plants for which the ancestor diverged at least 725 million years ago (MYA). Half of these genes are also present in other eukaryotic and/or prokaryotic species. Thus, our results indicate that (i) a strong negative selection pressure has conserved a number of genes as unique in genomes throughout evolution, (ii) most unique genes are subjected to a low divergence rate, (iii) they have some features observed in housekeeping genes but for most of them there is no functional annotation and (iv) they may have an ancient origin involving a possible gene transfer from ancestral chloroplasts or bacteria to the plant nucleus.

HAL Evry

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ProdInra

Genes of the most conserved WOX clade in plants affect root and flower development in Arabidopsis

Authors: Claisse Gaelle; Deveaux Yves; Kreis Martin; Laufs Patrick; Lecharny Alain; Moreau Hervé; Morin Halima; Thareau Vincent; Toffano-Nioche Claire
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The Wuschel related homeobox (WOX) family proteins are key regulators implicated in the determination of cell fate in plants by preventing cell differentiation. A recent WOX phylogeny, based on WOX homeodomains, showed that all of the Physcomitrella patens and Selaginella moellendorffii WOX proteins clustered into a single orthologous group. We hypothesized that members of this group might preferentially share a significant part of their function in phylogenetically distant organisms. Hence, we first validated the limits of the WOX13 orthologous group (WOX13 OG) using the occurrence of other clade specific signatures and conserved intron insertion sites. Secondly, a functional analysis using expression data and mutants was undertaken. Results: The WOX13 OG contained the most conserved plant WOX proteins including the only WOX detected in the highly proliferating basal unicellular and photosynthetic organism Ostreococcus tauri. A large expansion of the WOX family was observed after the separation of mosses from other land plants and before monocots and dicots have arisen. In Arabidopsis thaliana, AtWOX13 was dynamically expressed during primary and lateral root initiation and development, in gynoecium and during embryo development. AtWOX13 appeared to affect the floral transition. An intriguing clade, represented by the functional AtWOX14 gene inside the WOX13 OG, was only found in the Brassicaceae. Compared to AtWOX13, the gene expression profile of AtWOX14 was restricted to the early stages of lateral root formation and specific to developing anthers. A mutational insertion upstream of the AtWOX14 homeodomain sequence led to abnormal root development, a delay in the floral transition and premature anther differentiation. 
Conclusion: Our data provide evidence in favor of the WOX13 OG as the clade containing the most conserved WOX genes and established a functional link to organ initiation and development in Arabidopsis, most likely by preventing premature differentiation. The future use of Ostreococcus tauri and Physcomitrella patens as biological models should allow us to obtain a better insight into the functional importance of WOX13 OG genes.

Crossref

Springer - Publisher Connector

PubMed Central

HAL Descartes

ProdInra

Exploration of plant genomes in the FLAGdb++ environment

Authors: Aubourg Sébastien; Brunaud Véronique; Dèrozier Sandra; Gagnot Séverine; Grevet Philippe; Guichard Cécile; Label Philippe; Lecharny Alain; Leplé Jean-Charles; Samson Franck; Tamby Jean-Philippe
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: In the contexts of genomics, post-genomics and systems biology approaches, data integration presents a major concern. Databases provide crucial solutions: they store, organize and allow information to be queried, they enhance the visibility of newly produced data by comparing them with previously published results, and facilitate the exploration and development of both existing hypotheses and new ideas. Results: The FLAGdb++ information system was developed with the aim of using whole plant genomes as physical references in order to gather and merge available genomic data from in silico or experimental approaches. Available through a Java application, original interfaces and tools assist the functional study of plant genes by considering them in their specific context: chromosome, gene family, orthology group, co-expression cluster and functional network. FLAGdb++ is mainly dedicated to the exploration of large gene groups in order to decipher functional connections, to highlight shared or specific structural or functional features, and to facilitate translational tasks between plant species (Arabidopsis thaliana, Oryza sativa, Populus trichocarpa and Vitis vinifera). Conclusion: Combining original data with the output of experts and graphical displays that differ from classical plant genome browsers, FLAGdb++ presents a powerful complementary tool for exploring plant genomes and exploiting structural and functional resources, without the need for computer programming knowledge. First launched in 2002, a 15th version of FLAGdb++ is now available and comprises four model plant genomes and over eight million genomic features.

HAL Evry

Crossref

Springer - Publisher Connector

PubMed Central

HAL Descartes

ProdInra

Analysis of CATMA transcriptome data identifies hundreds of novel functional genes and improves gene models in the Arabidopsis genome

Authors: Aubourg Sébastien; Balzergue Sandrine; Bitton Frédérique; Brunaud Véronique; Ingouff Mathieu; Jullien Pauline E; Lecharny Alain; Martin-Magniette Marie-Laure; Renou Jean-Pierre; Schiex Thomas; Taconnat Ludivine; Thareau Vincent
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Since the finishing of the sequencing of the Arabidopsis thaliana genome, the Arabidopsis community and the annotator centers have been working on the improvement of gene annotation at the structural and functional levels. In this context, we have used the large CATMA resource on the Arabidopsis transcriptome to search for genes missed by different annotation processes. Probes on the CATMA microarrays are specific gene sequence tags (GSTs) based on the CDS models predicted by the Eugene software. Among the 24 576 CATMA v2 GSTs, 677 are in regions considered as intergenic by the TAIR annotation. We analyzed the cognate transcriptome data in the CATMA resource and carried out data-mining to characterize novel genes and improve gene models. Results: The statistical analysis of the results of more than 500 hybridized samples distributed among 12 organs provides an experimental validation for 465 novel genes. The hybridization evidence was confirmed by RT-PCR approaches for 88% of the 465 novel genes. Comparisons with the current annotation show that these novel genes often encode small proteins, with an average size of 137 aa. Our approach has also led to the improvement of pre-existing gene models through both the extension of 16 CDS and the identification of 13 gene models erroneously constituted of two merged CDS. Conclusion: This work is a noticeable step forward in the improvement of the Arabidopsis genome annotation. We increased the number of Arabidopsis validated genes by 465 novel transcribed genes to which we associated several functional annotations such as expression profiles, sequence conservation in plants, cognate transcripts and protein motifs.

HAL Evry

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HAL Descartes

ProdInra

Sélection de variables pour la classification par mélanges gaussiens pour prédire la fonction des gènes orphelins [Variable selection for Gaussian mixture clustering to predict the function of orphan genes]

Authors: Aubourg Sebastien; Celeux Gilles; Lecharny Alain; Martin-Magniette Marie-Laure; Maugis Cathy; Renou Jean-Pierre; Tamby Jean-Philippe
Publication venue: Modulad
Publication date: 01/01/2009

Biologists are interested in predicting the functions of genes in organisms with sequenced genomes from microarray transcriptome data. The development of microarray technology makes it possible to study the whole genome under different experimental conditions. This abundance of information may seem to be an advantage for gene clustering. However, the structure of interest is often contained in a subset of the available variables. The currently available variable selection procedures in model-based clustering assume that the irrelevant clustering variables are all independent or are all linked with the relevant clustering variables. A more versatile variable selection model is proposed, taking into account three possible roles for each variable: relevant clustering variables, redundant variables and independent variables. A model selection criterion and a variable selection algorithm are derived for this new variable role modelling. The value of this new modelling for discovering the function of orphan genes is illustrated on a transcriptome dataset for Arabidopsis thaliana.

HAL Evry

INRIA a CCSD electronic archive server

ProdInra

Hal-Diderot

GeneFarm, structural and functional annotation of Arabidopsis gene and protein families by a network of experts

Genomic projects heavily depend on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that improved annotation will allow better data mining of genomes, and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach since many features obtained for one gene can be extrapolated to some or all the other genes of a family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Pro

RERO DOC Digital Library

GeneFarm, structural and functional annotation of Arabidopsis gene and protein families by a network of experts

HAL-ENS-LYON

HAL Evry

Crossref

Hal - Université Grenoble Alpes

HAL Clermont Université

Ghent University Academic Bibliography

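Relations entre l'organisation des sites de fixation des facteurs de transcription, la fonction des gènes et l'expression des gènes (vers une annotation des sites de fixation chez Arabidopsis thaliana) [Relationships between the organization of transcription factor binding sites, gene function and gene expression (towards an annotation of binding sites in Arabidopsis thaliana)]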

Authors: BERNARD Virginie; LECHARNY Alain
Publication date: 01/01/2009

Transcription factor binding sites, or regulatory elements, are involved in the regulation of gene expression. Genome annotation and transcriptomic data now make it possible to gain a better knowledge of promoter architecture. Some regulatory elements are conserved at a preferential position in promoters. We developed an approach to characterize such motifs in A. thaliana. This work produced a map of promoters by identifying 5105 motifs characterized by local over-representation in proximal promoters. The core promoter, where the TATA-box, a regulatory element conserved among eukaryotes, is observed, was analysed in depth. We identified a list of 15 functional variants of the TATA-box and a new class of regulatory elements that share the TATA-box topological constraints: the TC-motifs. They are conserved in both A. thaliana and O. sativa but have not been observed in mammalian genomes. The 18% of A. thaliana genes that contain a TC-motif tend to be expressed in specific experimental conditions. The TC-motifs might be involved in the regulation of gene expression. The study of the YR initiator element in A. thaliana revealed an extension of these 4 dinucleotides into the 5' UTR. Associations between these regulatory elements may indicate a functional collaboration. Searching for functional characteristics shared by genes with the same organization of regulatory elements could contribute to the functional annotation of these elements.

EVRY-Bib. électronique (912289901) / Sudoc, France

OpenGrey Repository

Analyse de l'évolution du génome d'Arabidopsis thaliana par l'étude de familles de gènes [Analysis of the evolution of the Arabidopsis thaliana genome through the study of gene families]

Authors: BOUDET Nathalie; LECHARNY Alain
Publication date: 01/01/2002

PARIS7-Bibliothèque centrale (751132105) / Sudoc, France

OpenGrey Repository