
    Gene conversion in human rearranged immunoglobulin genes

    Over the past 20 years, many DNA sequences have been published suggesting that all or part of the VH segment of a rearranged immunoglobulin gene may be replaced in vivo. Two different mechanisms appear to be operating. One of these is very similar to primary V(D)J recombination, involving the RAG proteins acting upon recombination signal sequences, and this has recently been proven to occur. Other sequences, many of which show partial VH replacements with no addition of untemplated nucleotides at the VH–VH joint, have been proposed to occur by an unusual RAG-mediated recombination with the formation of hybrid (coding-to-signal) joints. These appear to occur in cells already undergoing somatic hypermutation in which, some authors are convinced, RAG genes are silenced. We recently proposed that the latter type of VH replacement might occur by homologous recombination initiated by the activity of AID (activation-induced cytidine deaminase), which is essential for somatic hypermutation and gene conversion. The latter has been observed in other species, but not in human Ig genes, so far. In this paper, we present a new analysis of sequences published as examples of the second type of rearrangement. This not only shows that AID recognition motifs occur in recombination regions but also that some sequences show replacement of central sections by a sequence from another gene, similar to gene conversion in the immunoglobulin genes of other species. These observations support the proposal that this type of rearrangement is likely to be AID-mediated rather than RAG-mediated and is consistent with gene conversion.
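
The abstract above reports that AID recognition motifs occur in recombination regions but does not say which motif definition was used. As a rough illustration only, the sketch below scans a DNA sequence for WRCY and its reverse complement RGYW, a commonly cited AID hotspot motif; the choice of motif and the function names `find_aid_hotspots` and `hotspots_in_window` are assumptions made for this example, not the authors' method.

```python
import re

# IUPAC codes used by the motifs: W = A/T, R = A/G, Y = C/T.
# WRCY (and its reverse complement RGYW) is an assumed hotspot definition;
# the abstract does not state which motif the authors searched for.
# Lookahead patterns are used so that overlapping hits are all reported.
HOTSPOT_PATTERNS = {
    "WRCY": re.compile(r"(?=([AT][AG]C[CT]))"),
    "RGYW": re.compile(r"(?=([AG]G[CT][AT]))"),
}


def find_aid_hotspots(seq):
    """Return sorted (position, motif_name, matched_text) tuples, 0-based."""
    seq = seq.upper()
    hits = []
    for name, pattern in HOTSPOT_PATTERNS.items():
        for match in pattern.finditer(seq):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


def hotspots_in_window(seq, start, end):
    """Hotspots lying entirely inside the half-open window [start, end)."""
    return [h for h in find_aid_hotspots(seq) if start <= h[0] and h[0] + 4 <= end]
```

Calling `hotspots_in_window(sequence, start, end)` with the coordinates of a putative recombination region would list the motif occurrences that fall inside it; whether that count is more than expected by chance is a separate statistical question this sketch does not address.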

    The Life-Cycle of Operons

    Operons are a major feature of all prokaryotic genomes, but how and why operon structures vary is not well understood. To elucidate the life-cycle of operons, we compared gene order between Escherichia coli K12 and its relatives and identified the recently formed and destroyed operons in E. coli. This allowed us to determine how operons form, how they become closely spaced, and how they die. Our findings suggest that operon evolution may be driven by selection on gene expression patterns. First, both operon creation and operon destruction lead to large changes in gene expression patterns. For example, the removal of lysA and ruvA from ancestral operons that contained essential genes allowed their expression to respond to lysine levels and DNA damage, respectively. Second, some operons have undergone accelerated evolution, with multiple new genes being added during a brief period. Third, although genes within operons are usually closely spaced because of a neutral bias toward deletion and because of selection against large overlaps, genes in highly expressed operons tend to be widely spaced because of regulatory fine-tuning by intervening sequences. Although operon evolution may be adaptive, it need not be optimal: new operons often comprise functionally unrelated genes that were already in proximity before the operon formed.

    A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

    Protein-coding genes in eukaryotes are interrupted by introns, but intron densities differ widely between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. Excellent agreement between the MCMC and ML inferences is demonstrated, whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing.
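
For readers unfamiliar with Dollo parsimony, which the abstract above compares against MCMC and ML, the following is a minimal sketch of the idea for a single intron position: the intron is gained exactly once, at the most recent common ancestor of all species that carry it, and every absence below that point is explained by loss. The tree encoding (a `children` dictionary) and the function name `dollo_states` are assumptions for this example and are not taken from the paper.

```python
def dollo_states(children, root, present_leaves):
    """Dollo parsimony for one binary character (intron present/absent).

    children: dict mapping each internal node to a list of its child nodes.
    present_leaves: leaves in which the intron is observed.
    Returns the set of nodes (leaves and internal) inferred to carry the intron.
    """
    present_leaves = set(present_leaves)
    count = {}  # number of carrier leaves at or below each node

    def fill(node):
        kids = children.get(node, [])
        if not kids:
            count[node] = 1 if node in present_leaves else 0
        else:
            count[node] = sum(fill(kid) for kid in kids)
        return count[node]

    total = fill(root)
    if total == 0:
        return set()

    # Single gain: walk down to the most recent common ancestor of all carriers.
    origin = root
    while True:
        below = [kid for kid in children.get(origin, []) if count[kid] == total]
        if not below:
            break
        origin = below[0]

    # Below the gain point, a node keeps the intron if any descendant has it;
    # branches leading only to non-carriers are explained by a loss.
    carriers = set()
    stack = [origin]
    while stack:
        node = stack.pop()
        if count[node] > 0:
            carriers.add(node)
            stack.extend(children.get(node, []))
    return carriers
```

In this sketch an intron is never assigned to any ancestor older than the common ancestor of its observed carriers, so an intron that was present deeper in the tree and later lost in whole lineages goes uncounted. That is one plausible reading of why a Dollo reconstruction can give lower ancestral densities than probabilistic methods, though the abstract itself does not spell out the cause.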

    Sm/Lsm Genes Provide a Glimpse into the Early Evolution of the Spliceosome

    The spliceosome, a sophisticated molecular machine involved in the removal of intervening sequences from the coding sections of eukaryotic genes, appeared and subsequently evolved rapidly during the early stages of eukaryotic evolution. The last eukaryotic common ancestor (LECA) had both complex spliceosomal machinery and some spliceosomal introns, yet little is known about the early stages of evolution of the spliceosomal apparatus. The Sm/Lsm family of proteins has been suggested as one of the earliest components of the emerging spliceosome and hence provides a first in-depth glimpse into the evolving spliceosomal apparatus. An analysis of 335 Sm and Sm-like genes from 80 species across all three kingdoms of life reveals two significant observations. First, the eukaryotic Sm/Lsm family underwent two rapid waves of duplication with subsequent divergence resulting in 14 distinct genes. Each wave resulted in a more sophisticated spliceosome, reflecting a possible jump in the complexity of the evolving eukaryotic cell. Second, an unusually high degree of conservation in intron positions is observed within individual orthologous Sm/Lsm genes and between some of the Sm/Lsm paralogs. This suggests that functional spliceosomal introns existed before the emergence of the complete Sm/Lsm family of proteins; hence, spliceosomal machinery with considerably fewer components than today's spliceosome was already functional.

    Phylogenomics: Gene Duplication, Unrecognized Paralogy and Outgroup Choice

    Comparative genomics has revealed the ubiquity of gene and genome duplication and subsequent gene loss. In the case of gene duplication and subsequent loss, gene trees can differ from species trees; thus, frequent gene duplication poses a challenge for reconstruction of species relationships. Here I address the case of multi-gene sets of putative orthologs that include some unrecognized paralogs due to ancestral gene duplication, and ask how outgroups should best be chosen to reduce the degree of non-species tree (NST) signal. Consideration of expected internal branch lengths supports several conclusions: (i) when a single outgroup is used, the degree of NST signal arising from gene duplication is either independent of outgroup choice, or is minimized by use of a maximally closely related post-duplication (MCRPD) outgroup; (ii) when two outgroups are used, NST signal is minimized by using one MCRPD outgroup, while the position of the second outgroup is of lesser importance; and (iii) when two outgroups are used, the ability to detect gene trees that are inconsistent with known aspects of the species tree is maximized by use of one MCRPD, and is either independent of the position of the second outgroup, or is maximized for a more distantly related second outgroup. Overall, these results generalize the utility of closely related outgroups for phylogenetic analysis.

    Intron Evolution: Testing Hypotheses of Intron Evolution Using the Phylogenomics of Tetraspanins

    BACKGROUND: Although large-scale informatics studies on introns can be useful in making broad inferences concerning patterns of intron gain and loss, more specific questions about intron evolution at a finer scale can be addressed using a gene family where structure and function are well known. Genome-wide surveys of tetraspanins from a broad array of organisms with fully sequenced genomes are an excellent means to understand specifics of intron evolution. Our approach incorporated several new fully sequenced genomes that cover the major lineages of the animal kingdom as well as plants, protists and fungi. The analysis of exon/intron gene structure in such an evolutionarily broad set of genomes allowed us to identify ancestral intron structure in tetraspanins throughout the eukaryotic tree of life.
    METHODOLOGY/PRINCIPAL FINDINGS: We performed a phylogenomic analysis of the intron/exon structure of the tetraspanin protein family. In addition to the already characterized tetraspanin introns numbered 1 through 6 found in animals, three additional ancient, phase 0 introns, which we call 4a, 4b and 4c, were found. These three novel introns, in combination with the ancestral introns 1 to 6, define three basic tetraspanin gene structures which have been conserved throughout the animal kingdom. Our phylogenomic approach also allows the estimation of the time at which the introns of the 33 human tetraspanin paralogs appeared, which in many cases coincides with the concomitant acquisition of new introns. On the other hand, we observed that new introns (introns other than 1–6, 4a, 4b and 4c) were not randomly inserted into the tetraspanin gene structure. The region of tetraspanin genes corresponding to the small extracellular loop (SEL) accounts for only 10.5% of the total sequence length but had 46% of the new animal intron insertions.
    CONCLUSIONS/SIGNIFICANCE: Our results indicate that tests of intron evolution are strengthened by the phylogenomic approach with specific gene families like tetraspanins. These tests add to our understanding of genomic innovation coupled to major evolutionary divergence events, functional constraints and the timing of the appearance of evolutionary novelty.
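
The two percentages reported for the small extracellular loop (10.5% of sequence length, 46% of new animal intron insertions) can be turned into a fold enrichment with a line of arithmetic. The sketch below assumes a simple null model in which insertions are expected in proportion to sequence length; that null model and the function name `fold_enrichment` are assumptions for illustration, and no significance test is attempted because the abstract does not give the underlying counts.

```python
def fold_enrichment(fraction_of_events, fraction_of_length):
    """Observed share of events divided by the share expected from length alone."""
    if not 0 < fraction_of_length <= 1:
        raise ValueError("fraction_of_length must be in (0, 1]")
    return fraction_of_events / fraction_of_length


# Figures from the abstract: SEL is 10.5% of length, 46% of new insertions.
sel_enrichment = fold_enrichment(0.46, 0.105)
# The remainder of the gene holds the other 54% of insertions in 89.5% of length.
rest_enrichment = fold_enrichment(1 - 0.46, 1 - 0.105)

print(f"SEL: {sel_enrichment:.2f}x, rest of gene: {rest_enrichment:.2f}x")
# prints "SEL: 4.38x, rest of gene: 0.60x"
```

Under this length-proportional assumption the SEL region receives a little over four times as many new introns as expected, while the rest of the gene receives fewer than expected.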

    Intron Dynamics in Ribosomal Protein Genes

    The role of spliceosomal introns in eukaryotic genomes remains obscure. A large-scale analysis of intron presence/absence patterns in many gene families and species is a necessary step to clarify the role of these introns. In this analysis, we used a maximum likelihood method to reconstruct the evolution of 2,961 introns in a dataset of 76 ribosomal protein genes from 22 eukaryotes and validated the results by a maximum parsimony method. Our results show that the trends of intron gain and loss differed across species in a given kingdom but appeared to be consistent within subphyla. Most subphyla in the dataset diverged around 1 billion years ago, when the “Big Bang” radiation occurred. We speculate that spliceosomal introns may have played a role in the explosion of many eukaryotes at the Big Bang radiation.

    A proteogenomic update to Yersinia: enhancing genome annotation

    Background: Modern biomedical research depends on a complete and accurate proteome. With the widespread adoption of new sequencing technologies, genome sequences are generated at a near-exponential rate, diminishing the time and effort that can be invested in genome annotation. The resulting gene set contains numerous errors in even the most basic form of annotation: the primary structure of the proteins.
    Results: The application of experimental proteomics data to genome annotation, called proteogenomics, can quickly and efficiently discover misannotations, yielding a more accurate and complete genome annotation. We present a comprehensive proteogenomic analysis of the plague bacterium, Yersinia pestis KIM. We discover non-annotated genes, correct protein boundaries, remove spuriously annotated ORFs, and make major advances towards accurate identification of signal peptides. Finally, we apply our data to 21 other Yersinia genomes, correcting and enhancing their annotations.
    Conclusions: In total, 141 gene models were altered and have been updated in RefSeq and GenBank, which can be accessed seamlessly through any NCBI tool (e.g. BLAST) or downloaded directly. Along with the improved gene models we discover new, more accurate means of identifying signal peptides in proteomics data.

    Localization of a bacterial group II intron-encoded protein in human cells

    Group II introns are mobile retroelements that self-splice from precursor RNAs to form ribonucleoparticles (RNPs), which can invade new specific genomic DNA sites. This specificity can be reprogrammed, for insertion into any desired DNA site, making these introns useful tools for bacterial genetic engineering. However, previous studies have suggested that these elements may function inefficiently in eukaryotes. We investigated the subcellular distribution, in cultured human cells, of the protein encoded by the group II intron RmInt1 (IEP) and several mutants. We created fusions with yellow fluorescent protein (YFP) and with a FLAG epitope. We found that the IEP was localized in the nucleus and nucleolus of the cells. Remarkably, it also accumulated at the periphery of the nuclear matrix. We were also able to identify spliced lariat intron RNA, which co-immunoprecipitated with the IEP, suggesting that functional RmInt1 RNPs can be assembled in cultured human cells.
    This work was supported by research grants CSD 2009–0006 from the Consolider-Ingenio, BIO2011-24401 and BIO2014-51953-P from the Spanish Ministerio de Economía y Competitividad, all including ERDF (European Regional Development Funds). We thank Dr. Antonio Barrientos Durán for technical advice. MRC was supported by an FPI Ph.D. grant. J.L.G.P.'s laboratory is supported by CICE-FEDER-P09-CTS-4980, CICE-FEDER-P12-CTS-2256, Plan Nacional de I+D+I 2008–2011 and 2013–2016 (FIS-FEDER-PI11/01489 and FIS-FEDER-PI14/02152), PCIN-2014-115-ERA-NET NEURON II, the European Research Council (ERC-Consolidator ERC-STG-2012-233764) and by an International Early Career Scientist grant from the Howard Hughes Medical Institute (IECS-55007420). Peer reviewed.

    AUG_hairpin: prediction of a downstream secondary structure influencing the recognition of a translation start site

    Background: The translation start site plays an important role in the control of translation efficiency of eukaryotic mRNAs. The recognition of the start AUG codon by eukaryotic ribosomes is considered to depend on its nucleotide context. However, the fraction of eukaryotic mRNAs with the start codon in a suboptimal context is relatively large. It may be expected that mRNA should possess some features providing efficient translation, including the proper recognition of a translation start site. It has been experimentally shown that a downstream hairpin located in certain positions with respect to the start codon can compensate in part for the suboptimal AUG context and also increases translation from non-AUG initiation codons. Prediction of such a compensatory hairpin may be useful in the evaluation of eukaryotic mRNA translation properties.
    Results: We evaluated the interdependency between the start codon context and mRNA secondary structure at the beginning of the CDS: a suboptimal start codon context was found to correlate significantly with higher base pairing probabilities at positions 13–17 of the CDS of human and mouse mRNAs. It is likely that downstream hairpins are used to enhance translation of some mammalian mRNAs in vivo. Thus, we have developed a tool, AUG_hairpin, to predict local stem-loop structures located within the defined region at the beginning of the mRNA coding part. The implemented algorithm is based on the available published experimental data on the CDS-located stem-loop structures influencing the recognition of upstream start codons.
    Conclusion: An occurrence of a potential secondary structure downstream of a start AUG codon in a suboptimal context (or downstream of a potential non-AUG start codon) may provide researchers with a testable assumption on the presence of an additional regulatory signal influencing the mRNA translation initiation rate and the start codon choice. AUG_hairpin, which has a convenient Web interface with adjustable parameters, will make such an evaluation easy and efficient.
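
The abstract above turns on whether a start codon sits in an optimal or suboptimal nucleotide context, without stating the criterion used. As a hedged illustration, the sketch below applies the widely used Kozak-style rule of thumb (a purine at position -3 and a G at position +4, counting the A of AUG as +1); this rule and the function name `start_codon_context` are assumptions for the example. It does not reproduce AUG_hairpin, which predicts downstream stem-loop structures rather than classifying context.

```python
def start_codon_context(mrna, aug_index):
    """Classify the context of the AUG whose A is at aug_index (0-based).

    Assumed rule of thumb: "strong" needs a purine at -3 and G at +4,
    "adequate" needs one of the two, "weak" has neither.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA alphabets
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    # Position -3 is three bases upstream of the A; +4 follows the G of AUG.
    minus3 = seq[aug_index - 3] if aug_index >= 3 else None
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else None
    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"
```

Under this assumed rule, transcripts classified as "adequate" or "weak" would be the ones for which a compensatory downstream hairpin, as described in the abstract, is worth looking for.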