A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes
Protein-coding genes in eukaryotes are interrupted by introns, but intron densities differ widely between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated, whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing.
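To illustrate the Dollo parsimony method that the abstract compares against MCMC and ML, here is a minimal sketch for a single intron position. It assumes the standard Dollo rule (an intron is gained once, at the last common ancestor of all species carrying it, and may then be lost any number of times). The tree, species names, and function name are hypothetical and are not taken from the paper.

```python
def dollo_reconstruction(children, root, present_leaves):
    """Infer which nodes carry an intron under Dollo parsimony.

    children: dict mapping each internal node to a list of its child nodes.
    present_leaves: leaves of the tree in which the intron is observed
                    (assumed to be valid leaf names).
    Returns (gain_node, nodes_with_intron, number_of_losses).
    """
    present_leaves = set(present_leaves)
    total = len(present_leaves)
    if total == 0:
        return None, set(), 0

    # Count, for every node, the intron-bearing leaves in its subtree.
    counts = {}

    def count(node):
        kids = children.get(node, [])
        if kids:
            counts[node] = sum(count(k) for k in kids)
        else:
            counts[node] = 1 if node in present_leaves else 0
        return counts[node]

    count(root)

    # The single gain is placed at the deepest node whose subtree
    # still contains every intron-bearing leaf.
    gain = root
    while True:
        below = [k for k in children.get(gain, []) if counts[k] == total]
        if not below:
            break
        gain = below[0]

    # Below the gain node, a lineage keeps the intron while it still leads
    # to an intron-bearing leaf; otherwise one loss is scored on that edge.
    carriers, losses = set(), 0
    stack = [gain]
    while stack:
        node = stack.pop()
        carriers.add(node)
        for kid in children.get(node, []):
            if counts[kid] > 0:
                stack.append(kid)
            else:
                losses += 1
    return gain, carriers, losses


# Hypothetical four-species toy tree (illustration only).
TOY_TREE = {
    "root": ["opisthokont", "plant"],
    "opisthokont": ["animal", "yeast"],
    "animal": ["human", "fly"],
}

gain, carriers, losses = dollo_reconstruction(TOY_TREE, "root", {"human", "plant"})
print(gain, sorted(carriers), losses)
```

Because Dollo parsimony forbids parallel gains, an intron shared by distant species is always pushed back to their common ancestor. The probabilistic methods in the abstract instead weigh gain and loss rates along each branch.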
Intron Dynamics in Ribosomal Protein Genes
The role of spliceosomal introns in eukaryotic genomes remains obscure. A large scale analysis of intron presence/absence patterns in many gene families and species is a necessary step to clarify the role of these introns. In this analysis, we used a maximum likelihood method to reconstruct the evolution of 2,961 introns in a dataset of 76 ribosomal protein genes from 22 eukaryotes and validated the results by a maximum parsimony method. Our results show that the trends of intron gain and loss differed across species in a given kingdom but appeared to be consistent within subphyla. Most subphyla in the dataset diverged around 1 billion years ago, when the “Big Bang” radiation occurred. We speculate that spliceosomal introns may have played a role in the explosive diversification of many eukaryotes at the Big Bang radiation.
Replication Protein A (RPA) Hampers the Processive Action of APOBEC3G Cytosine Deaminase on Single-Stranded DNA
Using deamination assays and expression of A3G in yeast, we show that replication protein A (RPA), the eukaryotic single-stranded DNA (ssDNA) binding protein, severely inhibits the deamination activity and processivity of A3G on long ssDNA regions. This resembles the “hit and run” single base substitution events observed in yeast. We propose that RPA plays a role in the protection of the human genome from A3G and other deaminases when they are inadvertently diverted from their natural targets. We propose a model where RPA serves as one of the guardians of the genome that protects ssDNA from the destructive processive activity of deaminases by non-specific steric hindrance.
Signs of positive selection of somatic mutations in human cancers detected by EST sequence analysis
BACKGROUND: Carcinogenesis typically involves multiple somatic mutations in caretaker (DNA repair) and gatekeeper (tumor suppressors and oncogenes) genes. Analysis of mutation spectra of the tumor suppressor that is most commonly mutated in human cancers, p53, unexpectedly suggested that somatic evolution of the p53 gene during tumorigenesis is dominated by positive selection for gain of function. This conclusion is supported by accumulating experimental evidence of evolution of new functions of p53 in tumors. These findings prompted a genome-wide analysis of possible positive selection during tumor evolution. METHODS: A comprehensive analysis of probable somatic mutations in the sequences of Expressed Sequence Tags (ESTs) from malignant tumors and normal tissues was performed in order to assess the prevalence of positive selection in cancer evolution. For each EST, the numbers of synonymous and non-synonymous substitutions were calculated. In order to identify genes with a signature of positive selection in cancers, these numbers were compared to: i) expected numbers and ii) the numbers for the respective genes in the ESTs from normal tissues. RESULTS: We identified 112 genes with a signature of positive selection in cancers, i.e., a significantly elevated ratio of non-synonymous to synonymous substitutions, in tumors as compared to 37 such genes in an approximately equal-sized EST collection from normal tissues. A substantial fraction of the tumor-specific positive-selection candidates have experimentally demonstrated or strongly predicted links to cancer. CONCLUSION: The results of EST analysis should be interpreted with extreme caution given the noise introduced by sequencing errors and undetected polymorphisms. Furthermore, an inherent limitation of EST analysis is that multiple mutations amenable to statistical analysis can be detected only in relatively highly expressed genes. Nevertheless, the present results suggest that positive selection might affect a substantial number of genes during tumorigenic somatic evolution.
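The comparison of observed non-synonymous and synonymous counts against expected numbers can be sketched with an exact one-sided binomial test. This is only one simple way to flag an excess of non-synonymous substitutions and is not necessarily the test the authors used. The function names, the counts, and the expected non-synonymous fraction are all hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def nonsyn_excess_pvalue(nonsyn, syn, expected_nonsyn_fraction):
    """One-sided p-value for an excess of non-synonymous substitutions.

    nonsyn, syn: observed substitution counts for one gene.
    expected_nonsyn_fraction: fraction of substitutions expected to be
        non-synonymous without selection (an assumption supplied by the
        caller, e.g. derived from the gene's codon composition).
    """
    return binomial_upper_tail(nonsyn, nonsyn + syn, expected_nonsyn_fraction)


# Hypothetical counts for one gene in tumor-derived and normal-tissue ESTs.
EXPECTED_FRACTION = 0.7  # assumed value, for illustration only
p_tumor = nonsyn_excess_pvalue(nonsyn=14, syn=1, expected_nonsyn_fraction=EXPECTED_FRACTION)
p_normal = nonsyn_excess_pvalue(nonsyn=8, syn=5, expected_nonsyn_fraction=EXPECTED_FRACTION)
print(f"tumor p = {p_tumor:.4f}, normal p = {p_normal:.4f}")
```

A gene would be a candidate only if the excess appears in tumor ESTs and not in the matching normal-tissue ESTs, which mirrors comparison ii) in the abstract. Screening many genes this way would also call for a multiple-testing correction.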
Exploring Fold Space Preferences of New-born and Ancient Protein Superfamilies
The evolution of proteins is one of the fundamental processes that has delivered the diversity and complexity of life we see around us today. While we tend to define protein evolution in terms of sequence level mutations, insertions and deletions, it is hard to translate these processes to a more complete picture incorporating a polypeptide's structure and function. By considering how protein structures change over time we can gain an entirely new appreciation of their long-term evolutionary dynamics. In this work we seek to identify how populations of proteins at different stages of evolution explore their possible structure space. We apply an annotation of superfamily age to this space and explore the relationship between these ages and a diverse set of properties pertaining to a superfamily's sequence, structure and function. We note several marked differences between the populations of newly evolved and ancient structures, such as in their length distributions, secondary structure content and tertiary packing arrangements. In particular, many of these differences suggest a less elaborate structure for newly evolved superfamilies when compared with their ancient counterparts. We show that the structural preferences we report are not a residual effect of a more fundamental relationship with function. Furthermore, we demonstrate the robustness of our results, using significant variation in the algorithm used to estimate the ages. We present these age estimates as a useful tool to analyse protein populations. In particular, we apply this in a comparison of domains containing Greek key or jelly roll motifs.
Evolutionary Convergence on Highly-Conserved 3′ Intron Structures in Intron-Poor Eukaryotes and Insights into the Ancestral Eukaryotic Genome
The presence of spliceosomal introns in eukaryotes raises a range of questions about genomic evolution. Along with the fundamental mysteries of introns' initial proliferation and persistence, the evolutionary forces acting on intron sequences remain largely mysterious. Intron number varies across species from a few introns per genome to several introns per gene, and the elements of intron sequences directly implicated in splicing vary from degenerate to strict consensus motifs. We report a 50-species comparative genomic study of intron sequences across most eukaryotic groups. We find two broad and striking patterns. First, we find that some highly intron-poor lineages have undergone evolutionary convergence to strong 3′ consensus intron structures. This finding holds for both branch point sequence and distance between the branch point and the 3′ splice site. Interestingly, this difference appears to exist within the genomes of green algae of the genus Ostreococcus, which exhibit highly constrained intron sequences through most of the intron-poor genome, but not in one much more intron-dense genomic region. Second, we find evidence that ancestral genomes contained highly variable branch point sequences, similar to more complex modern intron-rich eukaryotic lineages. In addition, ancestral structures are likely to have included polyT tails similar to those in metazoans and plants, which we found in a variety of protist lineages. Intriguingly, intron structure evolution appears to be quite different across lineages experiencing different types of genome reduction: whereas lineages with very few introns tend towards highly regular intronic sequences, lineages with very short introns tend towards highly degenerate sequences. Together, these results attest to the complex nature of ancestral eukaryotic splicing, the qualitatively different evolutionary forces acting on intron structures across modern lineages, and the impressive evolutionary malleability of eukaryotic gene structures.
Genetic Variation in OAS1 Is a Risk Factor for Initial Infection with West Nile Virus in Man
West Nile virus (WNV) is a re-emerging pathogen that can cause fatal encephalitis. In mice, susceptibility to WNV has been reported to result from a single point mutation in oas1b, which encodes 2′–5′ oligoadenylate synthetase 1b, a member of the type I interferon-regulated OAS gene family involved in viral RNA degradation. In man, the human ortholog of oas1b appears to be OAS1. The ‘A’ allele at SNP rs10774671 of OAS1 has previously been shown to alter splicing of OAS1 and to be associated with reduced OAS activity in PBMCs. Here we show that the frequency of this hypofunctional allele is increased in both symptomatic and asymptomatic WNV seroconverters (Caucasians from five US centers; total n = 501; OR = 1.6 [95% CI 1.2–2.0], P = 0.0002 in a recessive genetic model). We then directly tested the effect of this SNP on viral replication in a novel ex vivo model of WNV infection in primary human lymphoid tissue. Virus accumulation varied markedly among donors, and was highest for individuals homozygous for the ‘A’ allele (P<0.0001). Together, these data identify OAS1 SNP rs10774671 as a host genetic risk factor for initial infection with WNV in humans.
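The odds ratio and confidence interval reported under a recessive genetic model can be illustrated with a short sketch. It assumes the textbook approach of collapsing genotypes into homozygous risk-allele carriers versus everyone else, then applying the Woolf (log-based) 95% interval. The abstract does not state how the interval was computed, and the genotype counts below are invented for illustration, not the study's data.

```python
from math import exp, log, sqrt


def recessive_table(cases, controls, risk_allele="A"):
    """Collapse genotype counts into a 2x2 table under a recessive model.

    cases, controls: dicts mapping genotype strings (e.g. "AA", "AG", "GG")
    to counts. Returns (a, b, c, d) where a/b are cases with/without the
    homozygous risk genotype and c/d are the same for controls.
    """
    hom = risk_allele * 2
    a = cases.get(hom, 0)
    b = sum(n for g, n in cases.items() if g != hom)
    c = controls.get(hom, 0)
    d = sum(n for g, n in controls.items() if g != hom)
    return a, b, c, d


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    Requires all four cells to be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (illustration only).
case_counts = {"AA": 20, "AG": 50, "GG": 30}
control_counts = {"AA": 10, "AG": 50, "GG": 40}

table = recessive_table(case_counts, control_counts)
or_value, ci_low, ci_high = odds_ratio_ci(*table)
print(f"OR = {or_value:.2f} [95% CI {ci_low:.2f}-{ci_high:.2f}]")
```

Under the recessive model only the AA genotype counts as exposed, so heterozygotes are pooled with GG homozygotes before the odds are compared.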
Discovery of a New Human Polyomavirus Associated with Trichodysplasia Spinulosa in an Immunocompromized Patient
The Polyomaviridae constitute a family of small DNA viruses infecting a variety of hosts. In humans, polyomaviruses can cause infections of the central nervous system, urinary tract, skin, and possibly the respiratory tract. Here we report the identification of a new human polyomavirus in plucked facial spines of a heart transplant patient with trichodysplasia spinulosa, a rare skin disease exclusively seen in immunocompromized patients. The trichodysplasia spinulosa-associated polyomavirus (TSV) genome was amplified through rolling-circle amplification and consists of a 5232-nucleotide circular DNA organized similarly to known polyomaviruses. Two putative “early” (small and large T antigen) and three putative “late” (VP1, VP2, VP3) genes were identified. The TSV large T antigen contains several domains (e.g. J-domain) and motifs (e.g. HPDKGG, pRb family-binding, zinc finger) described for other polyomaviruses and potentially involved in cellular transformation. Phylogenetic analysis revealed a close relationship of TSV with the Bornean orangutan polyomavirus and, more distantly, the Merkel cell polyomavirus that is found integrated in Merkel cell carcinomas of the skin. The presence of TSV in the affected patient's skin was confirmed by newly designed quantitative TSV-specific PCR, indicative of a viral load of 10^5 copies per cell. After topical cidofovir treatment, the lesions largely resolved, coinciding with a reduction in TSV load. PCR screening demonstrated a 4% prevalence of TSV in an unrelated group of immunosuppressed transplant recipients without apparent disease. In conclusion, a new human polyomavirus was discovered and identified as the possible cause of trichodysplasia spinulosa in immunocompromized patients. The presence of TSV also in clinically unaffected individuals suggests frequent virus transmission causing subclinical, probably latent infections. Further studies have to reveal the impact of TSV infection in relation to other populations and diseases.
Non-conventional sources of peptides presented by MHC class I
Effectiveness of immune surveillance of intracellular viruses and bacteria depends upon a functioning antigen presentation pathway that allows infected cells to reveal the presence of an intracellular pathogen. The antigen presentation pathway uses virtually all endogenous polypeptides as a source to produce antigenic peptides that are eventually chaperoned to the cell surface by MHC class I molecules. Intriguingly, MHC I molecules present peptides encoded not only in the primary open reading frames but also those encoded in alternate reading frames. Here, we review recent studies on the generation of cryptic pMHC I. We focus on the immunological significance of cryptic pMHC I, and the novel translational mechanisms that allow production of these antigenic peptides from unconventional sources.
- …