15,777 research outputs found

    Intron-loss evolution of hatching enzyme genes in Teleostei

    Background: Hatching enzyme, a member of the astacin metalloprotease family, digests the egg envelope at embryo hatching. Orthologous genes of the enzyme are found in all vertebrate genomes. Recently, we found that the exon-intron structures of these genes are conserved among tetrapods, while the genes of teleosts have frequently lost their introns. Such intron losses in teleostean hatching enzyme genes are an uncommon evolutionary event, as most eukaryotic genes are interrupted by introns and the intron insertion sites are generally conserved from species to species. Here, we report on extensive studies of the exon-intron structures of teleostean hatching enzyme genes for insight into how and why introns were lost during evolution. Results: We investigated the evolutionary pathway of intron losses in hatching enzyme genes of 27 species of Teleostei. Hatching enzyme genes of basal teleosts are of only one type, which conserves the 9-exon-8-intron structure of an assumed ancestor. In contrast, otocephalans and euteleosts possess two types of hatching enzyme genes, suggesting a gene duplication event in the common ancestor of otocephalans and euteleosts. The duplicated genes were classified into two clades, clades I and II, based on phylogenetic analysis. In otocephalans and euteleosts, clade I genes developed a phylogeny-specific structure, such as an 8-exon-7-intron, 5-exon-4-intron, 4-exon-3-intron or intron-less structure. By contrast, the structures of clade II genes were relatively stable and similar to that of the ancestral genes. Expression analyses revealed that hatching enzyme genes were highly expressed compared with housekeeping genes. When expression levels were compared between clade I and II genes, clade I genes tended to be expressed more highly than clade II genes. Conclusions: Hatching enzyme genes evolved to lose their introns, and the intron-loss events occurred at specific points of teleostean phylogeny. We propose that the high-expression hatching enzyme genes frequently lost their introns during the evolution of teleosts, while the low-expression genes maintained the exon-intron structure of the ancestral gene.

    ExDom: an integrated database for comparative analysis of the exon–intron structures of protein domains in eukaryotes

    We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of the exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon–intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from the seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated with variations in the domain architecture of the corresponding protein isoforms. These novel analytical features are well suited to detailed investigations of the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. The ExDom database is freely accessible at: http://66.170.16.154/ExDom/

    Improved ontology for eukaryotic single-exon coding sequences in biological databases

    Indexed in: Scopus. Efficient extraction of knowledge from biological data requires the development of structured vocabularies to unambiguously define biological terms. This paper proposes descriptions and definitions to disambiguate the term 'single-exon gene'. Eukaryotic Single-Exon Genes (SEGs) have been defined as genes that do not have introns in their protein coding sequences. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancer and neurological/developmental disorders, and many exhibit tissue-specific transcription. Unfortunately, the term 'SEGs' is rife with ambiguity, leading to biological misinterpretations. In the classic definition, no distinction is made between SEGs that harbor introns in their untranslated regions (UTRs) and those that do not. This distinction is important because the presence of introns in UTRs affects transcriptional regulation and post-transcriptional processing of the mRNA. In addition, recent whole-transcriptome shotgun sequencing has led to the discovery of many examples of single-exon mRNAs that arise from alternative splicing of multi-exon genes; these single-exon isoforms are being confused with SEGs despite their clearly different origin. The increasing expansion of RNA-seq datasets makes it imperative to distinguish the different SEG types before annotation errors become indelibly propagated in biological databases. This paper develops a structured vocabulary for their disambiguation, allowing a major reassessment of their evolutionary trajectories, regulation, RNA processing and transport, and provides the opportunity to improve the detection of gene associations with disorders including cancers, neurological and developmental diseases. © The Author(s) 2018. Published by Oxford University Press. https://academic.oup.com/database/article/doi/10.1093/database/bay089/509943

    Regulation of splicing factors by alternative splicing and NMD is conserved between kingdoms yet evolutionarily flexible.

    Ultraconserved elements, unusually long regions of perfect sequence identity, are found in genes encoding numerous RNA-binding proteins including arginine-serine rich (SR) splicing factors. Expression of these genes is regulated via alternative splicing of the ultraconserved regions to yield mRNAs that are degraded by nonsense-mediated mRNA decay (NMD), a process termed unproductive splicing (Lareau et al. 2007; Ni et al. 2007). As all human SR genes are affected by alternative splicing and NMD, one might expect this regulation to have originated in an early SR gene and persisted as duplications expanded the SR family. But in fact, unproductive splicing of most human SR genes arose independently (Lareau et al. 2007). This paradox led us to investigate the origin and proliferation of unproductive splicing in SR genes. We demonstrate that unproductive splicing of the splicing factor SRSF5 (SRp40) is conserved among all animals and even observed in fungi; this is a rare example of alternative splicing conserved between kingdoms, yet its effect is to trigger mRNA degradation. As the gene duplicated, the ancient unproductive splicing was lost in paralogs, and distinct unproductive splicing evolved rapidly and repeatedly to take its place. SR genes have consistently employed unproductive splicing, and while it is exceptionally conserved in some of these genes, turnover in specific events among paralogs shows flexible means to the same regulatory end.

    SinEx DB: a database for single exon coding sequences in mammalian genomes

    Indexed in: Web of Science. Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33, and the complete dataset of SEG sequences and their functional predictions is available for download. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) BLAST searches against the in-house SinEx DB of SEGs and (iii) an advanced search mode in which the database can be searched by keywords and any combination of species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. https://academic.oup.com/database/article-lookup/doi/10.1093/database/baw09

    Exon-phase symmetry and intrinsic structural disorder promote modular evolution in the human genome

    A key signature of module exchange in the genome is phase symmetry of exons, suggestive of exon shuffling events that occurred without disrupting the translation reading frame. At the protein level, intrinsic structural disorder may be another key element because disordered regions often serve as functional elements that can be effectively integrated into a protein structure. Therefore, we asked whether exon-phase symmetry in the human genome and structural disorder in the human proteome are connected, signalling such evolutionary mechanisms in the assembly of multi-exon genes. We found an elevated level of structural disorder in regions encoded by symmetric exons and a preferred symmetry of exons encoding mostly disordered regions (>70% predicted disorder). Alternatively spliced symmetric exons tend to correspond to the most disordered regions. The genes of mostly disordered proteins (>70% predicted disorder) tend to be assembled from symmetric exons, which often arise by internal tandem duplications. The preponderance of certain types of short motifs (e.g. SH3-binding motif) and domains (e.g. high-mobility group domains) suggests that certain disordered modules have been particularly effective in exon-shuffling events. Our observations suggest that structural disorder has facilitated modular assembly of complex genes in the evolution of the human genome. © 2013 The Author(s)

    In search of lost introns

    Many fundamental questions concerning the emergence and subsequent evolution of eukaryotic exon-intron organization are still unsettled. Genome-scale comparative studies, which can shed light on crucial aspects of eukaryotic evolution, require adequate computational tools. We describe novel computational methods for studying spliceosomal intron evolution. Our goal is to give a reliable characterization of the dynamics of intron evolution. Our algorithmic innovations address the identification of orthologous introns, and the likelihood-based analysis of intron data. We discuss a compression method for the evaluation of the likelihood function, which is noteworthy for phylogenetic likelihood problems in general. We prove that after $O(nL)$ preprocessing time, subsequent evaluations take $O(nL/\log L)$ time almost surely in the Yule-Harding random model of $n$-taxon phylogenies, where $L$ is the input sequence length. We illustrate the practicality of our methods by compiling and analyzing a data set involving 18 eukaryotes, more than in any other study to date. The study yields the surprising result that ancestral eukaryotes were fairly intron-rich. For example, the bilaterian ancestor is estimated to have had more than 90% as many introns as vertebrates do now.

    Structural dynamics and divergence of the polygalacturonase gene family in land plants

    A distinct feature of eukaryotic genomes is the presence of gene families. The polygalacturonase (PG) (EC 3.2.1.15) gene family is one of the largest gene families in plants. PG is a pectin-digesting enzyme with a glycoside hydrolase 28 domain. It is involved in numerous plant developmental processes. The evolutionary processes accounting for the functional divergence and the specialized functions of PGs in land plants are unclear. Here, phylogenetic and gene structure analysis of PG genes in algae and land plants revealed that land plant PG genes resulted from differential intron gain and loss, with the latter event predominating. PG genes in land plants contained 15 homologous intron blocks and 13 novel intron blocks. Intron position and phase were not conserved between PGs of algae and land plants but were conserved among PG genes of land plants from moss to vascular plants, indicating that the current introns in the PGs of land plants appeared after the split between unicellular algae and multicellular land plants. These findings demonstrate that the functional divergence and differentiation of PGs in land plants is attributable to intron loss. Moreover, they underscore the importance of intron gain and loss in genomic adaptation to selective pressure.

    Delving into Vertebrate Serpins for Understanding their Evolution

    The superfamily of serine proteinase inhibitors (serpins) is involved in an array of fundamental biological processes such as blood coagulation, cell differentiation, cell migration, complement activation, embryo implantation, fibrinolysis, angiogenesis, inflammation, and tumor suppression. Vertebrate serpins can be conveniently classified into six sub-groups, based on three independent biological features: genomic organization, diagnostic amino acid sites and rare indels. The present vertebrate serpins are derived from an original serpin, most probably by intron insertion, and we are trying to reconstruct the phylogeny of vertebrate serpins and the original vertebrate gene(s). We started with fish genomes, characterized fish serpins and assigned orthology with respect to human serpins. Most fish serpins are characterized as stereotypical vertebrate serpins, with some interesting exceptions which suggest that either there are some fish-specific serpins or some fish serpins do not have human orthologs.
    Presented at "BREW 2005": http://cmb.molgen.mpg.de/brew/program.html