Comparative analysis indicates that alternative splicing in plants has a limited role in functional expansion of the proteome
Background: Alternative splicing (AS) is a widespread phenomenon in higher eukaryotes, but the extent to which it leads to functional protein isoforms and to proteome expansion at large is still a matter of debate. In contrast to animal species, for which AS has been studied extensively at the protein and functional level, protein-centered studies of AS in plant species are scarce. Here we investigate the functional impact of AS in dicot and monocot plant species using a comparative approach.
Results: Detailed comparison of AS events in alternatively spliced orthologs from the dicot Arabidopsis thaliana and the monocot Oryza sativa (rice) revealed that the vast majority of AS events in both species do not result from functional conservation. Transcript isoforms that are putative targets for the nonsense-mediated decay (NMD) pathway are as likely to contain conserved AS events as isoforms that are translated into proteins. Similar results were obtained when the same comparison was performed between the two more closely related monocot species rice and Zea mays (maize). Genome-wide computational analysis of functional protein domains encoded in alternatively and constitutively spliced genes revealed that only the RNA recognition motif (RRM) is overrepresented in alternatively spliced genes in all species analyzed. In contrast, three domain types were overrepresented in constitutively spliced genes. AS events were found to be less frequent within than outside predicted protein domains, and no domain type was found to be enriched with AS introns. Analysis of AS events that result in the removal of complete protein domains revealed that only a small number of domain types are spliced out in all species analyzed. Finally, in a substantial fraction of cases where a domain is completely removed, this domain appeared to be a unit of a tandem repeat.
Conclusion: The results from the ortholog comparisons suggest that the ability of a gene to produce more than one functional protein through AS does not persist during evolution. Cross-species comparison of the results of the protein-domain oriented analyses indicates little correspondence between the analyzed species. Based on the premise that functional genetic features are most likely to be conserved during evolution, we conclude that AS has only a limited role in functional expansion of the proteome in plants.
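As an illustration of how domain overrepresentation in alternatively spliced genes could be assessed, the following is a minimal sketch using a one-sided hypergeometric test. The abstract does not name the statistical test that was used, so the choice of test, the function name `hypergeom_upper_tail`, and all counts are assumptions made for illustration only.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """Return P(X >= k) for X ~ Hypergeometric.

    N: total number of genes (hypothetical population)
    K: genes in the population that encode the domain of interest
    n: alternatively spliced genes (the "drawn" subset)
    k: alternatively spliced genes that encode the domain
    """
    upper = min(K, n)
    # Sum the probability mass from k up to the largest possible overlap.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Toy numbers, not taken from the paper: 10 genes, 3 with the domain,
# 4 alternatively spliced, 2 of those carrying the domain.
p_value = hypergeom_upper_tail(k=2, K=3, n=4, N=10)
print(f"P(X >= 2) = {p_value:.4f}")
```

A small p-value would suggest that the domain occurs in alternatively spliced genes more often than expected from its overall frequency. In a genome-wide scan over many domain types, a multiple-testing correction would also be needed.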
Comparative BAC end sequence analysis of tomato and potato reveals overrepresentation of specific gene families in potato
Background: Tomato (Solanum lycopersicum) and potato (S. tuberosum) are two economically important crop species, the genomes of which are currently being sequenced. This study presents a first genome-wide analysis of these two species, based on two large collections of BAC end sequences representing approximately 19% of the tomato genome and 10% of the potato genome.
Results: The tomato genome has a higher repeat content than the potato genome, primarily due to a higher number of retrotransposon insertions in the tomato genome. On the other hand, simple sequence repeats (SSRs) are more abundant in potato than in tomato. The two genomes also differ in the frequency distribution of SSR motifs. Based on EST and protein alignments, potato appears to contain up to 6,400 more putative coding regions than tomato. Major gene families such as cytochrome P450 mono-oxygenases and serine-threonine protein kinases are significantly overrepresented in potato compared to tomato. Moreover, the P450 superfamily appears to have expanded spectacularly in both species compared to Arabidopsis thaliana, suggesting an expanded network of secondary metabolic pathways in the Solanaceae. Both tomato and potato appear to have a low level of microsynteny with A. thaliana. A higher degree of synteny was observed with Populus trichocarpa, specifically in the region between 15.2 and 19.4 Mb on P. trichocarpa chromosome 10.
Conclusion: The findings in this paper present a first glimpse into the evolution of Solanaceous genomes, both within the family and relative to other plant species. When the complete genome sequences of these species become available, whole-genome comparisons and protein- or repeat-family specific studies may shed more light on the observations made here.
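To make the notion of an SSR motif frequency distribution concrete, here is a small sketch that tallies di- and trinucleotide repeats in a sequence. The abstract does not describe the SSR detection procedure, so the regular expression, the minimum of four repeat units, and the function name `count_ssr_motifs` are assumptions chosen for illustration, not the authors' method.

```python
import re
from collections import Counter

# A 2-3 bp unit followed by at least three more copies of itself,
# i.e. four or more tandem repeat units in total (assumed threshold).
SSR_PATTERN = re.compile(r"([ACGT]{2,3})\1{3,}")


def count_ssr_motifs(sequence):
    """Count non-overlapping di-/trinucleotide SSRs per repeat unit."""
    counts = Counter()
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as AAAAAAAA, which also match the pattern.
        if len(set(motif)) == 1:
            continue
        counts[motif] += 1
    return counts


# Toy sequence, not real BAC end data.
print(count_ssr_motifs("GGATATATATATCCCAGCAGCAGCAGTT"))
```

This sketch reports each motif as it first appears in the sequence and does not merge rotations or reverse complements (for example AT and TA), which a real comparison between two genomes would probably need to do.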
Local coexpression domains in the genome of rice show no microsynteny with Arabidopsis domains
Chromosomal coexpression domains are found in a number of different genomes under various developmental conditions. The size of these domains and the number of genes they contain vary. Here, we define local coexpression domains as adjacent genes for which all possible pair-wise correlations of expression data are higher than 0.7. In rice, such local coexpression domains consist predominantly of two genes, and of at most four, and make up ∼5% of the genomic neighboring genes when different expression platforms from the public domain are examined. The genes in local coexpression domains do not fall in the same ontology category significantly more often than neighboring genes that are not coexpressed. Duplication, orientation, or the distance between the genes does not solely explain coexpression. Coexpression is therefore thought to be regulated at the level of chromatin structure. The characteristics of the local coexpression domains in rice are strikingly similar to those of such domains in the Arabidopsis genome. Yet, no microsynteny between local coexpression domains in Arabidopsis and rice could be identified. Although the rice genome is not yet as extensively annotated as the Arabidopsis genome, the lack of conservation of local coexpression domains may indicate that such domains have not played a major role in the evolution of genome structure or in genome conservation.
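The definition of a local coexpression domain given in this abstract (adjacent genes whose pairwise expression correlations all exceed 0.7) can be sketched as follows. This is a hypothetical illustration: the greedy left-to-right scan, the size cap of four genes, and the function names are assumptions, and the abstract does not say how overlapping candidate domains were resolved.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def local_coexpression_domains(profiles, threshold=0.7, max_size=4):
    """Return (start, end) index pairs, inclusive, of non-overlapping domains.

    profiles: expression vectors listed in chromosomal gene order.
    A gene joins the current domain only if it correlates above the
    threshold with every gene already in that domain.
    """
    domains = []
    i, n = 0, len(profiles)
    while i < n - 1:
        end = i
        while (end + 1 < n and end + 1 - i < max_size and
               all(pearson(profiles[j], profiles[end + 1]) > threshold
                   for j in range(i, end + 1))):
            end += 1
        if end > i:
            domains.append((i, end))
            i = end + 1
        else:
            i += 1
    return domains


# Toy expression profiles for five neighboring genes (invented values).
genes = [[1, 2, 3, 4], [2, 4, 6, 8.5], [4, 3, 2, 1], [1, 3, 2, 4], [1, 3, 2, 5]]
print(local_coexpression_domains(genes))
```

With these toy profiles, genes 0 and 1 form one domain and genes 3 and 4 another, while the anticorrelated gene 2 separates them.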
The Tomato Sequencing Project, the First Cornerstone of the International Solanaceae Project (SOL)
The genome of tomato (Solanum lycopersicum) is being sequenced by an international
consortium of 10 countries (Korea, China, the United Kingdom, India, The
Netherlands, France, Japan, Spain, Italy and the United States) as part of a larger initiative
called the ‘International Solanaceae Genome Project (SOL): Systems Approach
to Diversity and Adaptation’. The goal of this grassroots initiative, launched in
November 2003, is to establish a network of information, resources and scientists
to ultimately tackle two of the most significant questions in plant biology and agriculture:
(1) How can a common set of genes/proteins give rise to a wide range of
morphologically and ecologically distinct organisms that occupy our planet? (2) How
can a deeper understanding of the genetic basis of plant diversity be harnessed to
better meet the needs of society in an environmentally friendly and sustainable manner?
The Solanaceae and closely related species such as coffee, which are included
in the scope of the SOL project, are ideally suited to address both of these questions.
The first step of the SOL project is to use an ordered BAC approach to generate a
high-quality sequence for the euchromatic portions of the tomato genome as a reference for
the Solanaceae. Due to the high level of macro- and micro-synteny in the Solanaceae,
the BAC-by-BAC tomato sequence will form the framework for shotgun sequencing
of other species. The starting point for sequencing the genome is BACs anchored
to the genetic map by overgo hybridization and AFLP technology. The overgos are
derived from approximately 1500 markers from the tomato high density F2-2000
genetic map (http://sgn.cornell.edu/). These seed BACs will be used as anchors from
which to radiate the tiling path using BAC end sequence data. Annotation will be
performed according to SOL project guidelines. All the information generated under
the SOL umbrella will be made available in a comprehensive website. The information
will be interlinked with the ultimate goal that the comparative biology of the
Solanaceae—and beyond—achieves a context that will facilitate a systems biology
approach.
Local Coexpression Domains of Two to Four Genes in the Genome of Arabidopsis
Expression of genes in eukaryotic genomes is known to cluster, but cluster size is generally loosely defined and highly variable. Here we adopt a very strict definition of a cluster: a set of physically adjacent genes that are highly coexpressed and form a so-called local coexpression domain. The Arabidopsis (Arabidopsis thaliana) genome was analyzed for the presence of such local coexpression domains to elucidate their functional characteristics. We used expression data sets that cover different experimental conditions, organs, tissues, and cells from the Massively Parallel Signature Sequencing repository and microarray data (Affymetrix) from a detailed root analysis. With these expression data, we identified 689 and 1,481 local coexpression domains, respectively, consisting of two to four genes with a pairwise Pearson's correlation coefficient larger than 0.7. This number is approximately 1- to 5-fold higher than the numbers expected by chance. A small (5%–10%) yet significant fraction of genes in the Arabidopsis genome is therefore organized into local coexpression domains. These local coexpression domains were distributed over the genome. Genes in such local domains were for the most part not categorized in the same functional category (GOslim). Neither tandemly duplicated genes nor shared promoter sequence nor gene distance explained the occurrence of coexpression of genes in such chromosomal domains. This indicates that other parameters in genes or gene positions are important to establish coexpression in local domains of Arabidopsis chromosomes.
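The comparison with "numbers expected by chance" can be illustrated with a permutation sketch: count coexpressed neighboring gene pairs in the observed gene order, then repeat the count after shuffling the gene order. The abstract does not state how the chance expectation was derived, so the shuffling scheme, the restriction to adjacent pairs, the number of shuffles, and the function names are all assumptions.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def count_coexpressed_neighbors(profiles, threshold=0.7):
    """Number of adjacent gene pairs whose correlation exceeds the threshold."""
    return sum(1 for a, b in zip(profiles, profiles[1:])
               if pearson(a, b) > threshold)


def chance_expectation(profiles, n_shuffles=200, threshold=0.7, seed=0):
    """Mean count of coexpressed neighbors over randomly shuffled gene orders."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(profiles)
    total = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += count_coexpressed_neighbors(shuffled, threshold)
    return total / n_shuffles


# Toy data (invented): three genes with a rising profile followed by
# three genes with a falling profile.
genes = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 2, 3, 5],
         [4, 3, 2, 1], [8, 6, 4, 2], [5, 3, 2, 1]]
observed = count_coexpressed_neighbors(genes)
expected = chance_expectation(genes)
print(f"observed={observed}, expected by chance={expected:.2f}")
```

Dividing the observed count by the shuffled mean gives a fold enrichment comparable in spirit to the 1- to 5-fold figure quoted in the abstract, although the toy numbers here say nothing about the real data.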