309 research outputs found
Efficient inference of bacterial strain trees from genome-scale multilocus data
Motivation: In bacterial evolution, inferring a strain tree, which is the evolutionary history of different strains of the same bacterium, plays a major role in analyzing and understanding the evolution of strongly isolated populations, population divergence and various evolutionary events, such as horizontal gene transfer and homologous recombination. Inferring a strain tree from multilocus data of these strains is exceptionally hard since, at this scale of evolution, processes such as homologous recombination result in a very high degree of gene tree incongruence
Duplication and Gene Conversion in the Drosophila melanogaster Genome
Using the genomic sequences of the Drosophila melanogaster subgroup, the pattern of gene duplications was investigated with special attention to interlocus gene conversion. Our fine-scale analysis with careful visual inspections enabled accurate identification of a number of duplicated blocks (genomic regions). The orthologous parts of those duplicated blocks were also identified in the D. simulans and D. sechellia genomes, by which we were able to clearly classify the duplicated blocks into post- and pre-speciation blocks. We found 31 post-speciation duplicated genes, from which the rate of gene duplication (from one copy to two copies) is estimated to be 1.0 × 10⁻⁹ per single-copy gene per year. The role of interlocus gene conversion was observed in several respects in our data: (1) synonymous divergence between a duplicated pair is overall very low. Consequently, the gene duplication rate would be seriously overestimated by counting duplicated genes with low divergence; (2) the sizes of young duplicated blocks are generally large. We postulate that the degeneration of gene conversion around the edges could explain the shrinkage of “identifiable” duplicated regions; and (3) elevated paralogous divergence is observed around the edges in many duplicated blocks, supporting our gene conversion–degeneration model. Our analysis demonstrated that gene conversion between duplicated regions is a common and genome-wide phenomenon in the Drosophila genomes, and that its role should be especially significant in the early stages of duplicated genes. Based on a population genetic prediction, we applied a new genome-scan method to test for signatures of selection for neofunctionalization and found a strong signature in a pair of transporter genes
Balancing Selection at the Tomato RCR3 Guardee Gene Family Maintains Variation in Strength of Pathogen Defense
Coevolution between hosts and pathogens is thought to occur between interacting molecules of both species. This results in the maintenance of genetic diversity at pathogen antigens (or so-called effectors) and host resistance genes such as the major histocompatibility complex (MHC) in mammals or resistance (R) genes in plants. In plant-pathogen interactions, the current paradigm posits that a specific defense response is activated upon recognition of pathogen effectors via interaction with their corresponding R proteins. According to the "Guard-Hypothesis," R proteins (the "guards") can sense modification of target molecules in the host (the "guardees") by pathogen effectors and subsequently trigger the defense response. Multiple studies have reported high genetic diversity at R genes maintained by balancing selection. In contrast, little is known about the evolutionary mechanisms shaping the guardee, which may be subject to contrasting evolutionary forces. Here we show that the evolution of the guardee RCR3 is characterized by gene duplication, frequent gene conversion, and balancing selection in the wild tomato species Solanum peruvianum. Investigating the functional characteristics of 54 natural variants through in vitro and in planta assays, we detected differences in recognition of the pathogen effector through interaction with the guardee, as well as substantial variation in the strength of the defense response. This variation is maintained by balancing selection at each copy of the RCR3 gene. Our analyses pinpoint three amino acid polymorphisms with key functional consequences for the coevolution between the guardee (RCR3) and its guard (Cf-2). We conclude that, in addition to coevolution at the "guardee-effector" interface for pathogen recognition, natural selection acts on the "guard-guardee" interface.
Guardee evolution may be governed by a counterbalance between improved activation in the presence and prevention of auto-immune responses in the absence of the corresponding pathogen
Both Positive and Negative Selection Pressures Contribute to the Polymorphism Pattern of the Duplicated Human CYP21A2 Gene
The human steroid 21-hydroxylase gene (CYP21A2) participates in cortisol and aldosterone biosynthesis, and resides together with its paralogous (duplicated) pseudogene in a multiallelic copy number variation (CNV), called RCCX CNV. Concerted evolution caused by non-allelic gene conversion has been described in great ape CYP21 genes, and the same conversion activity is responsible for a serious genetic disorder of CYP21A2, congenital adrenal hyperplasia (CAH). In the current study, 33 CYP21A2 haplotype variants encoding 6 protein variants were determined from a European population. CYP21A2 was shown to be one of the most diverse human genes (HHe=0.949), but the diversity of intron 2 was greater still. Contrary to previous findings, the evolution of intron 2 did not follow concerted evolution, although the remaining part of the gene did. Fixed sites (different fixed alleles of sites in human CYP21 paralogues) significantly accumulated in intron 2, indicating that the excess of fixed sites was connected to the lack of effective non-allelic conversion and concerted evolution. Furthermore, positive selection was presumably focused on intron 2, and possibly associated with the previous genetic features. However, the positive selection detected by several neutrality tests was discerned along the whole gene. In addition, the clear signature of negative selection was observed in the coding sequence. The maintenance of the CYP21 enzyme function is critical, and could lead to negative selection, whereas the presumed gene regulation altering steroid hormone levels via intron 2 might help fast adaptation, which broadly characterizes the genes of human CNVs responding to the environment
Fully Bayesian tests of neutrality using genealogical summary statistics
Background: Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Results: Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies and we demonstrate the utility of our method on four real data sets, identifying significant departures from neutrality in human influenza A virus, even after controlling for variation in population size. Conclusion: Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously ignored for limited availability of theory and methods.
A revised evolutionary history of the CYP1A subfamily : gene duplication, gene conversion, and positive selection
Author Posting. © The Authors, 2005. This is the author's version of the work. It is posted here by permission of Springer for personal use, not for redistribution. The definitive version was published in Journal of Molecular Evolution 62 (2006): 708-717, doi:10.1007/s00239-005-0134-z. Members of cytochrome P450 subfamily 1A (CYP1As) are involved in detoxification and bioactivation of common environmental pollutants. Understanding the functional evolution of these genes is essential to predicting and interpreting species differences in sensitivity to toxicity by such chemicals. The CYP1A gene subfamily comprises a single ancestral representative in most fish species and two paralogs in higher vertebrates, including birds and mammals. Phylogenetic analysis of complete coding sequences suggests that mammalian and bird paralog pairs (CYP1A1/2 and CYP1A4/5, respectively) are the result of independent gene duplication events. However, comparison of vertebrate genome sequences revealed that CYP1A genes lie within an extended region of conserved fine-scale synteny, suggesting that avian and mammalian CYP1A paralogs share a common genomic history. Algorithms designed to detect recombination between nucleotide sequences indicate that gene conversion has homogenized most of the length of the chicken CYP1A genes, as well as the 5' end of mammalian CYP1As. Together, these data indicate that avian and mammalian CYP1A paralog pairs resulted from a single gene duplication event and that extensive gene conversion is responsible for the exceptionally high degree of sequence similarity between CYP1A4 and CYP1A5. Elevated non-synonymous/synonymous substitution ratios within a putatively unconverted stretch of ~250 bp suggest that positive selection may have reduced the effective rate of gene conversion in this region, which contains two substrate recognition sites.
This work significantly alters our understanding of functional evolution in the CYP1A subfamily, suggesting that gene conversion and positive selection have been the dominant processes of sequence evolution. Funding for this work was provided by the NIH Superfund Basic Research Program at Boston University (5-P42-ES-07381) and by the Woods Hole Oceanographic Institution
Drosophila Duplication Hotspots Are Associated with Late-Replicating Regions of the Genome
Duplications play a significant role in both extremes of the phenotypic spectrum of newly arising mutations: they can have severe deleterious effects (e.g. duplications underlie a variety of diseases) but can also be highly advantageous. The phenotypic potential of newly arisen duplications has stimulated wide interest in both the mutational and selective processes shaping these variants in the genome. Here we take advantage of the Drosophila simulans–Drosophila melanogaster genetic system to further our understanding of both processes. Regarding mutational processes, the study of two closely related species allows investigation of the potential existence of shared duplication hotspots, and the similarities and differences between the two genomes can be used to dissect its underlying causes. Regarding selection, the difference in the effective population size between the two species can be leveraged to ask questions about the strength of selection acting on different classes of duplications. In this study, we conducted a survey of duplication polymorphisms in 14 different lines of D. simulans using tiling microarrays and combined it with an analogous survey for the D. melanogaster genome. By integrating the two datasets, we identified duplication hotspots conserved between the two species. However, unlike the duplication hotspots identified in mammalian genomes, Drosophila duplication hotspots are not associated with sequences of high sequence identity capable of mediating non-allelic homologous recombination. Instead, Drosophila duplication hotspots are associated with late-replicating regions of the genome, suggesting a link between DNA replication and duplication rates. We also found evidence supporting a higher effectiveness of selection on duplications in D. simulans than in D. melanogaster. This is also true for duplications segregating at high frequency, where we find evidence in D. simulans that a sizeable fraction of these mutations is being driven to fixation by positive selection
Pervasive Adaptive Protein Evolution Apparent in Diversity Patterns around Amino Acid Substitutions in Drosophila simulans
In Drosophila, multiple lines of evidence converge in suggesting that beneficial substitutions to the genome may be common. All suffer from confounding factors, however, such that the interpretation of the evidence—in particular, conclusions about the rate and strength of beneficial substitutions—remains tentative. Here, we use genome-wide polymorphism data in D. simulans and sequenced genomes of its close relatives to construct a readily interpretable characterization of the effects of positive selection: the shape of average neutral diversity around amino acid substitutions. As expected under recurrent selective sweeps, we find a trough in diversity levels around amino acid but not around synonymous substitutions, a distinctive pattern that is not expected under alternative models. This characterization is richer than previous approaches, which relied on limited summaries of the data (e.g., the slope of a scatter plot), and relates to underlying selection parameters in a straightforward way, allowing us to make more reliable inferences about the prevalence and strength of adaptation. Specifically, we develop a coalescent-based model for the shape of the entire curve and use it to infer adaptive parameters by maximum likelihood. Our inference suggests that ∼13% of amino acid substitutions cause selective sweeps. Interestingly, it reveals two classes of beneficial fixations: a minority (approximately 3%) that appears to have had large selective effects and accounts for most of the reduction in diversity, and the remaining 10%, which seem to have had very weak selective effects. These estimates therefore help to reconcile the apparent conflict among previously published estimates of the strength of selection. More generally, our findings provide unequivocal evidence for strongly beneficial substitutions in Drosophila and illustrate how the rapidly accumulating genome-wide data can be leveraged to address enduring questions about the genetic basis of adaptation