326 research outputs found

    Patterns of intron gain and conservation in eukaryotic genes

    Background: The presence of introns in protein-coding genes is a universal feature of eukaryotic genome organization, and the genes of multicellular eukaryotes, typically, contain multiple introns, a substantial fraction of which share position in distant taxa, such as plants and animals. Depending on the methods and data sets used, researchers have reached opposite conclusions on the causes of the high fraction of shared introns in orthologous genes from distant eukaryotes. Some studies conclude that shared intron positions reflect, almost entirely, a remarkable evolutionary conservation, whereas others attribute it to parallel gain of introns. To resolve these contradictions, it is crucial to analyze the evolution of introns by using a model that minimally relies on arbitrary assumptions. Results: We developed a probabilistic model of evolution that allows for variability of intron gain and loss rates over branches of the phylogenetic tree, individual genes, and individual sites. Applying this model to an extended set of conserved eukaryotic genes, we find that parallel gain, on average, accounts for only ~8% of the shared intron positions. However, the distribution of parallel gains over the phylogenetic tree of eukaryotes is highly non-uniform. There are, practically, no parallel gains in closely related lineages, whereas for distant lineages, such as animals and plants, parallel gains appear to contribute up to 20% of the shared intron positions. In accord with these findings, we estimated that ancestral introns have a high probability to be retained in extant genomes, and conversely, that a substantial fraction of extant introns have retained their positions since the early stages of eukaryotic evolution. In addition, the density of sites that are available for intron insertion is estimated to be, approximately, one in seven basepairs. Conclusion: We obtained robust estimates of the contribution of parallel gain to the observed sharing of intron positions between eukaryotic species separated by different evolutionary distances. The results indicate that, although the contribution of parallel gains varies across the phylogenetic tree, the high level of intron position sharing is due, primarily, to evolutionary conservation. Accordingly, numerous introns appear to persist in the same position over hundreds of millions of years of evolution. This is compatible with recent observations of a negative correlation between the rate of intron gain and coding sequence evolution rate of a gene, suggesting that at least some of the introns are functionally relevant.

    A highly conserved family of inactivated archaeal B family DNA polymerases

    A widespread and highly conserved family of apparently inactivated derivatives of archaeal B-family DNA polymerases is described. Phylogenetic analysis shows that the inactivated forms comprise a distinct clade among archaeal B-family polymerases and that, within this clade, Euryarchaea and Crenarchaea are clearly separated from each other and from a small group of bacterial homologs. These findings are compatible with an ancient duplication of the DNA polymerase gene followed by inactivation and parallel loss in some of the lineages, although contribution of horizontal gene transfer cannot be ruled out. The inactivated derivative of the archaeal DNA polymerase could form a complex with the active paralog and play a structural role in DNA replication. Reviewers: This article was reviewed by Purificacion Lopez-Garcia and Chris Ponting. For the full reviews, please go to the Reviewers' Reports section.

    EREM: Parameter Estimation and Ancestral Reconstruction by Expectation-Maximization Algorithm for a Probabilistic Model of Genomic Binary Characters Evolution

    Evolutionary binary characters are features of species or genes, indicating the absence (value zero) or presence (value one) of some property. Examples include eukaryotic gene architecture (the presence or absence of an intron in a particular locus), gene content, and morphological characters. In many studies, the acquisition of such binary characters is assumed to represent a rare evolutionary event, and consequently, their evolution is analyzed using various flavors of parsimony. However, when gain and loss of the character are not rare enough, a probabilistic analysis becomes essential. Here, we present a comprehensive probabilistic model to describe the evolution of binary characters on a bifurcating phylogenetic tree. A fast software tool, EREM, is provided, using maximum likelihood to estimate the parameters of the model and to reconstruct ancestral states (presence and absence in internal nodes) and events (gain and loss events along branches)
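
To make the idea of a probabilistic model for a binary character on a bifurcating tree concrete, here is a minimal sketch. It is not EREM and does not reproduce its model or its expectation-maximization procedure: it assumes a plain two-state continuous-time Markov process with a single gain rate and a single loss rate shared by all branches, and it only evaluates the likelihood of one presence/absence pattern using Felsenstein's pruning recursion. The tree encoding, the function names, and the `root_presence` prior are all hypothetical choices made for illustration.

```python
import math


def transition_matrix(gain, loss, t):
    """Return P where P[i][j] = P(state j after time t | state i at start).

    Assumes a two-state Markov process with constant gain (0 -> 1) and
    loss (1 -> 0) rates; this is an illustrative simplification.
    """
    total = gain + loss
    decay = math.exp(-total * t)
    pi1 = gain / total  # stationary probability of presence
    pi0 = loss / total  # stationary probability of absence
    return [
        [pi0 + pi1 * decay, pi1 - pi1 * decay],
        [pi0 - pi0 * decay, pi1 + pi0 * decay],
    ]


def partial_likelihoods(node, states, gain, loss):
    """Return [L0, L1]: probability of the observed leaves below `node`
    given that `node` itself is in state 0 or state 1.

    A leaf is a string (its name); an internal node is a tuple of
    (child, branch_length) pairs. Both encodings are hypothetical.
    """
    if isinstance(node, str):
        observed = states[node]
        return [1.0 if observed == s else 0.0 for s in (0, 1)]
    result = [1.0, 1.0]
    for child, branch_length in node:
        p = transition_matrix(gain, loss, branch_length)
        child_l = partial_likelihoods(child, states, gain, loss)
        for s in (0, 1):
            result[s] *= p[s][0] * child_l[0] + p[s][1] * child_l[1]
    return result


def tree_likelihood(root, states, gain, loss, root_presence=0.5):
    """Likelihood of the leaf pattern, averaging over the root state."""
    l0, l1 = partial_likelihoods(root, states, gain, loss)
    return (1.0 - root_presence) * l0 + root_presence * l1


# Hypothetical three-species tree: ((A:0.1, B:0.2):0.3, C:0.4)
inner = (("A", 0.1), ("B", 0.2))
root = ((inner, 0.3), ("C", 0.4))
observed = {"A": 1, "B": 1, "C": 0}  # e.g. intron present in A and B only
print(tree_likelihood(root, observed, gain=0.5, loss=1.0))
```

A maximum-likelihood tool would wrap a likelihood of this kind in an optimizer over the rate parameters; the sketch deliberately stops at the likelihood evaluation.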

    Expression of human AID in yeast induces mutations in context similar to the context of somatic hypermutation at G-C pairs in immunoglobulin genes

    BACKGROUND: Antibody genes are diversified by somatic hypermutation (SHM), gene conversion and class-switch recombination. All three processes are initiated by the activation-induced deaminase (AID). According to a DNA deamination model of SHM, AID converts cytosine to uracil in DNA sequences. The initial deamination of cytosine leads to mutation and recombination in pathways involving replication, DNA mismatch repair and possibly base excision repair. The DNA sequence context of mutation hotspots at G-C pairs during SHM is DGYW/WRCH (G-C is a hotspot position, R = A/G, Y = T/C, W = A/T, D = A/G/T). RESULTS: To investigate the mechanisms of AID-induced mutagenesis in a model system, we studied the genetic consequences of AID expression in yeast. We constructed a yeast vector with an artificially synthesized human AID gene insert using codons common to highly expressed yeast genes. We found that expression of the artificial hAIDSc gene was moderately mutagenic in a wild-type strain and highly mutagenic in an ung1 uracil-DNA glycosylase-deficient strain. A majority of mutations were at G-C pairs. In the ung1 strain, C-G to T-A transitions were found almost exclusively, while a mixture of transitions with 12% transversions was characteristic in the wild-type strain. In the ung1 strain, mutations that could have originated from deamination of the transcribed strand were found more frequently. In the wild-type strain, the strand bias was reversed. DGYW/WRCH motifs were preferential sites of mutations. CONCLUSION: The results are consistent with the hypothesis that AID-mediated deamination of DNA is a major cause of mutations at G-C base pairs in immunoglobulin genes during SHM. The sequence contexts of mutations in yeast induced by AID and those of somatic mutations at G-C pairs in immunoglobulin genes are significantly similar. This indicates that the intrinsic substrate specificity of AID itself is a primary determinant of mutational hotspots at G-C base pairs during SHM.
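
The DGYW/WRCH hotspot motif described above can be located in a sequence with a short scan. The following is a minimal sketch, not the authors' analysis pipeline: it expands the degenerate codes into regular-expression character classes and reports the 0-based position of the hotspot base (the C in WRCH, or the G in DGYW, which is the same motif read on the complementary strand). The code H = A/C/T is taken from the standard IUPAC nucleotide alphabet, since the abstract does not spell it out; the function name `hotspot_positions` is hypothetical.

```python
import re

# Degenerate codes expanded by hand:
#   W = A/T, R = A/G, H = A/C/T (standard IUPAC), D = A/G/T, Y = C/T.
# Zero-width lookaheads let overlapping motifs be reported.
WRCH = re.compile(r"(?=[AT][AG]C[ACT])")
DGYW = re.compile(r"(?=[AGT]G[CT][AT])")


def hotspot_positions(seq):
    """Return sorted (position, base) pairs for hotspot bases in `seq`.

    Positions are 0-based indices into `seq`. The hotspot is the third
    base of a WRCH match (C) or the second base of a DGYW match (G).
    """
    seq = seq.upper()
    hits = set()
    for match in WRCH.finditer(seq):
        hits.add((match.start() + 2, "C"))
    for match in DGYW.finditer(seq):
        hits.add((match.start() + 1, "G"))
    return sorted(hits)


# AGCT matches both orientations of the motif at once.
print(hotspot_positions("AGCT"))
```

Scanning a single strand for both WRCH and DGYW is equivalent to scanning both strands for WRCH, which is why the two patterns are reverse complements of each other.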

    Evolution of DNA polymerases: an inactivated polymerase-exonuclease module in Pol ε and a chimeric origin of eukaryotic polymerases from two classes of archaeal ancestors

    Background: Evolution of DNA polymerases, the key enzymes of DNA replication and repair, is central to any reconstruction of the history of cellular life. However, the details of the evolutionary relationships between DNA polymerases of archaea and eukaryotes remain unresolved. Results: We performed a comparative analysis of archaeal, eukaryotic, and bacterial B-family DNA polymerases, which are the main replicative polymerases in archaea and eukaryotes, combined with an analysis of domain architectures. Surprisingly, we found that eukaryotic Polymerase ε consists of two tandem exonuclease-polymerase modules, the active N-terminal module and a C-terminal module in which both enzymatic domains are inactivated. The two modules are only distantly related to each other, an observation that suggests the possibility that Pol ε evolved as a result of insertion and subsequent inactivation of a distinct polymerase, possibly, of bacterial descent, upstream of the C-terminal Zn-fingers, rather than by tandem duplication. The presence of an inactivated exonuclease-polymerase module in Pol ε parallels a similar inactivation of both enzymatic domains in a distinct family of archaeal B-family polymerases. The results of phylogenetic analysis indicate that eukaryotic B-family polymerases, most likely, originate from two distantly related archaeal B-family polymerases, one form giving rise to Pol ε, and the other one to the common ancestor of Pol α, Pol δ, and Pol ζ. The C-terminal Zn-fingers that are present in all eukaryotic B-family polymerases, unexpectedly, are homologous to the Zn-finger of archaeal D-family DNA polymerases that are otherwise unrelated to the B family. The Zn-finger of Pol ε shows a markedly greater similarity to the counterpart in archaeal PolD than the Zn-fingers of other eukaryotic B-family polymerases. Conclusion: Evolution of eukaryotic DNA polymerases seems to have involved previously unnoticed complex events. We hypothesize that the archaeal ancestor of eukaryotes encoded three DNA polymerases, namely, two distinct B-family polymerases and a D-family polymerase all of which contributed to the evolution of the eukaryotic replication machinery. The Zn-finger might have been acquired from PolD by the B-family form that gave rise to Pol ε prior to or in the course of eukaryogenesis, and subsequently, was captured by the ancestor of the other B-family eukaryotic polymerases. The inactivated polymerase-exonuclease module of Pol ε might have evolved by fusion with a distinct polymerase, rather than by duplication of the active module of Pol ε, and is likely to play an important role in the assembly of eukaryotic replication and repair complexes. Reviewers: This article was reviewed by Patrick Forterre, Arcady Mushegian, and Chris Ponting. For the full reviews, pleas

    Low-fidelity DNA synthesis by human DNA polymerase theta

    Human DNA polymerase theta (pol θ or POLQ) is a proofreading-deficient family A enzyme implicated in translesion synthesis (TLS) and perhaps in somatic hypermutation (SHM) of immunoglobulin genes. These proposed functions and kinetic studies imply that pol θ may synthesize DNA with low fidelity. Here, we show that when copying undamaged DNA, pol θ generates single base errors at rates 10- to more than 100-fold higher than for other family A members. Pol θ adds single nucleotides to homopolymeric runs at particularly high rates, exceeding 1% in certain sequence contexts, and generates single base substitutions at an average rate of 2.4 × 10⁻³, comparable to inaccurate family Y human pol κ (5.8 × 10⁻³) also implicated in TLS. Like pol κ, pol θ is processive, implying that it may be tightly regulated to avoid deleterious mutagenesis. Pol θ also generates certain base substitutions at high rates within sequence contexts similar to those inferred to be copied by pol θ during SHM of immunoglobulin genes in mice. Thus, pol θ is an exception among family A polymerases, and its low fidelity is consistent with its proposed roles in TLS and SHM.
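
To give a feel for what a substitution rate of this magnitude implies, the sketch below converts a rate into an expected number of errors and into the chance of at least one error over a stretch of copied DNA. It rests on assumptions that the abstract does not state: that the reported average rate can be read as a per-nucleotide probability, that it is uniform along the template, and that errors at different positions are independent. The function names and the 100-nucleotide stretch are hypothetical.

```python
def expected_errors(rate, length):
    """Expected number of substitutions when copying `length` nucleotides,
    assuming a uniform per-nucleotide error probability `rate`."""
    return rate * length


def prob_at_least_one_error(rate, length):
    """Probability of one or more substitutions in `length` nucleotides,
    assuming errors at different positions are independent."""
    return 1.0 - (1.0 - rate) ** length


# Average substitution rates quoted in the abstract.
POL_THETA_RATE = 2.4e-3
POL_KAPPA_RATE = 5.8e-3

for name, rate in (("pol theta", POL_THETA_RATE), ("pol kappa", POL_KAPPA_RATE)):
    print(name, expected_errors(rate, 100), prob_at_least_one_error(rate, 100))
```

Because real error rates depend strongly on sequence context, as the abstract itself notes, these figures are only an order-of-magnitude illustration.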

    Gene conversion in human rearranged immunoglobulin genes

    Over the past 20 years, many DNA sequences have been published suggesting that all or part of the VH segment of a rearranged immunoglobulin gene may be replaced in vivo. Two different mechanisms appear to be operating. One of these is very similar to primary V(D)J recombination, involving the RAG proteins acting upon recombination signal sequences, and this has recently been proven to occur. Other sequences, many of which show partial VH replacements with no addition of untemplated nucleotides at the VH–VH joint, have been proposed to occur by an unusual RAG-mediated recombination with the formation of hybrid (coding-to-signal) joints. These appear to occur in cells already undergoing somatic hypermutation in which, some authors are convinced, RAG genes are silenced. We recently proposed that the latter type of VH replacement might occur by homologous recombination initiated by the activity of AID (activation-induced cytidine deaminase), which is essential for somatic hypermutation and gene conversion. The latter has been observed in other species, but not in human Ig genes, so far. In this paper, we present a new analysis of sequences published as examples of the second type of rearrangement. This not only shows that AID recognition motifs occur in recombination regions but also that some sequences show replacement of central sections by a sequence from another gene, similar to gene conversion in the immunoglobulin genes of other species. These observations support the proposal that this type of rearrangement is likely to be AID-mediated rather than RAG-mediated and is consistent with gene conversion.