35 research outputs found

    An approximate Markov model for the Wright-Fisher diffusion and its application to time series data

    The joint and accurate inference of selection and demography from genetic data is considered a particularly challenging question in population genetics, since both processes may lead to very similar patterns of genetic diversity. However, additional information for disentangling these effects may be obtained by observing changes in allele frequencies over multiple time points. Such data are common in experimental evolution studies, as well as in the comparison of ancient and contemporary samples. Leveraging this information, however, has been computationally challenging, particularly when considering multi-locus data sets. To overcome these issues, we introduce a novel, discrete approximation for diffusion processes, termed mean transition time approximation, which preserves the long-term behavior of the underlying continuous diffusion process. We then derive this approximation for the particular case of inferring selection and demography from time series data under the classic Wright-Fisher model and demonstrate that our approximation is well suited to describe allele trajectories through time, even when only a few states are used. We then develop a Bayesian inference approach to jointly infer the population size and locus-specific selection coefficients with high accuracy, and further extend this model to also infer the rates of sequencing errors and mutations. We finally apply our approach to recent experimental data on the evolution of drug resistance in Influenza virus, identifying likely targets of selection and finding evidence for much larger viral population sizes than previously reported.
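
For orientation, the kind of allele-frequency time series this abstract refers to can be sketched with a plain discrete-generation Wright-Fisher simulation. This is not the mean transition time approximation introduced in the paper; it is only a minimal illustration of the underlying model, and the function name, the parameter names (`p0`, `s`, `n_diploid`) and the genic selection scheme with relative fitness `1 + s` are assumptions made for the example.

```python
import random


def wright_fisher_trajectory(p0, s, n_diploid, generations, rng):
    """Simulate one allele-frequency trajectory (hypothetical helper).

    p0          starting frequency of the focal allele
    s           selection coefficient (focal allele has relative fitness 1 + s)
    n_diploid   diploid population size N, so 2N gene copies are sampled
    """
    n_copies = 2 * n_diploid
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Deterministic frequency change caused by selection.
        p_selected = p * (1 + s) / (1 + s * p)
        # Genetic drift: binomial sampling of 2N gene copies.
        count = sum(rng.random() < p_selected for _ in range(n_copies))
        p = count / n_copies
        trajectory.append(p)
    return trajectory


rng = random.Random(1)
traj = wright_fisher_trajectory(p0=0.1, s=0.05, n_diploid=100, generations=50, rng=rng)
print(traj[::10])  # frequency at every tenth generation
```

Sampling such a trajectory at a few time points gives data of the shape the inference method is described as using; smaller `n_diploid` makes drift dominate, which is why selection and demography are hard to separate from a single snapshot.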

    Decay of linkage disequilibrium within genes across HGDP-CEPH human samples: most population isolates do not show increased LD

    9 pages, 2 figures, 4 additional files. [Background] It is well known that the pattern of linkage disequilibrium varies between human populations, with remarkable geographical stratification. Indirect association studies routinely exploit linkage disequilibrium around genes, particularly in isolated populations where it is assumed to be higher. Here, we explore both the amount and the decay of linkage disequilibrium with physical distance along 211 gene regions, most of them related to complex diseases, across 39 HGDP-CEPH population samples, focusing particularly on the populations defined as isolates. Within each gene region and population we use r² between all possible single nucleotide polymorphism (SNP) pairs as a measure of linkage disequilibrium and focus on the proportion of SNP pairs with r² greater than 0.8. [Results] Although the average r² was found to be significantly different both between and within continental regions, a much higher proportion of r² variance could be attributed to differences between continental regions (2.8% vs. 0.5%, respectively). Similarly, while the proportion of SNP pairs with r² > 0.8 was significantly different across continents for all distance classes, it was generally much more homogeneous within continents, except in the case of Africa and the Americas. The only isolated populations with consistently higher LD in all distance classes with respect to their continent are the Kalash (Central South Asia) and the Surui (America). Moreover, isolated populations showed only slightly higher proportions of SNP pairs with r² > 0.8 per gene region than non-isolated populations in the same continent. Thus, the number of SNPs in isolated populations that need to be genotyped may be only slightly less than in non-isolates. [Conclusion] The "isolated population" label by itself does not guarantee a greater genotyping efficiency in association studies, and properties other than increased linkage disequilibrium may make these populations interesting in genetic epidemiology. This research was supported by "Fundación Genoma España" (proyectos piloto CEGEN 2004–2005), Dirección General de Investigación, Ministerio de Educación y Ciencia of Spain (grants BFU2005-00243, BFU2006-01235, BFU2006-15413-CO2-01, SEJ2006-13537) and Direcció General de Recerca, Generalitat de Catalunya (2005SGR00608). SNP genotyping services were provided by the Spanish "Centro Nacional de Genotipado". Peer reviewed.
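
The summary statistic described in this abstract, the proportion of SNP pairs with r² above 0.8, can be sketched as follows. This is an illustrative computation only: it assumes phased haplotypes coded as 0/1 lists (one list per SNP), whereas the study may have estimated r² differently from genotype data, and the function names are hypothetical.

```python
from itertools import combinations


def r_squared(snp_a, snp_b):
    """r^2 between two biallelic SNPs given as equal-length 0/1 haplotype lists."""
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(a * b for a, b in zip(snp_a, snp_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None  # at least one SNP is monomorphic in this sample
    return d * d / denom


def proportion_high_ld(snps, threshold=0.8):
    """Fraction of informative SNP pairs whose r^2 exceeds the threshold."""
    values = [r_squared(a, b) for a, b in combinations(snps, 2)]
    values = [v for v in values if v is not None]
    if not values:
        return None
    return sum(v > threshold for v in values) / len(values)


# Toy data: four haplotypes, three SNPs (assumed values, not from the study).
snps = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
print(proportion_high_ld(snps))
```

In the toy data the first two SNPs are perfectly correlated and the third is independent of both, so one of the three pairs passes the threshold.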

    Signatures of Environmental Genetic Adaptation Pinpoint Pathogens as the Main Selective Pressure through Human Evolution

    Previous genome-wide scans of positive natural selection in humans have identified a number of non-neutrally evolving genes that play important roles in skin pigmentation, metabolism, or immune function. Recent studies have also shown that a genome-wide pattern of local adaptation can be detected by identifying correlations between patterns of allele frequencies and environmental variables. Despite these observations, the degree to which natural selection is primarily driven by adaptation to local environments, and the role of pathogens or other ecological factors as selective agents, is still under debate. To address this issue, we correlated the spatial allele frequency distribution of a large sample of SNPs from 55 distinct human populations to a set of environmental factors that describe local geographical features such as climate, diet regimes, and pathogen loads. In concordance with previous studies, we detected a significant enrichment of genic SNPs, and particularly non-synonymous SNPs associated with local adaptation. Furthermore, we show that the diversity of the local pathogenic environment is the predominant driver of local adaptation, and that climate, at least as measured here, only plays a relatively minor role. While background demography by far makes the strongest contribution in explaining the genetic variance among populations, we detected about 100 genes which show an unexpectedly strong correlation between allele frequencies and pathogenic environment, after correcting for demography. Conversely, for diet regimes and climatic conditions, no genes show a similar correlation between the environmental factor and allele frequencies. This result is validated using low-coverage sequencing data for multiple populations. 
Among the loci targeted by pathogen-driven selection, we found an enrichment of genes associated with autoimmune diseases, such as celiac disease, type 1 diabetes, and multiple sclerosis, which lends credence to the hypothesis that some susceptibility alleles for autoimmune diseases may be maintained in human populations due to past selective processes.

    Human genetic diversity in genes related to host-pathogen interactions

    The present thesis includes four studies with a common objective: determining whether pathogens (viruses, bacteria, parasites) have exerted selective pressures on the genomes of their hosts (for example, humans). Detecting signatures of positive selection is a useful tool to identify functionally relevant genomic regions, since selection locally shapes functional variation. Based on this premise, we have studied the possible signatures of selection in genes related to host-pathogen interactions. Specifically, we have analyzed, in different human populations, genes encoding: a) components of the innate immune system; and b) glycosylation enzymes, most of them involved in four major glycan biosynthesis pathways. The main conclusion obtained from these studies is that both gene categories show clear signatures of selection. Moreover, we have determined that, depending on their biological context, certain genes are more affected by the action of natural selection.

    On Detecting Incomplete Soft or Hard Selective Sweeps Using Haplotype Structure

    We present a new haplotype-based statistic (nSL) for detecting both soft and hard sweeps in population genomic data from a single population. We compare our new method with classic single-population haplotype and site frequency spectrum (SFS)-based methods and show that it is more robust, particularly to recombination rate variation. However, all statistics show some sensitivity to the assumptions of the demographic model. Additionally, we show that nSL has at least as much power as other methods under a number of different selection scenarios, most notably in the cases of sweeps from standing variation and incomplete sweeps. This conclusion holds up under a variety of demographic models. In many respects, our new method is similar to the iHS statistic; however, it is generally more robust and does not require a genetic map. To illustrate the utility of our new method, we apply it to HapMap3 data and show that in the Yoruban population, there is strong evidence of selection on genes relating to lipid metabolism. This observation could be related to the known differences in cholesterol levels, and lipid metabolism more generally, between African Americans and other populations. We propose that the underlying causes for the selection on these genes are pleiotropic effects relating to blood parasites rather than their role in lipid metabolism.

    Evolutionary analysis of genes of two pathways involved in placental malaria infection

    Placental malaria is a special form of malaria that causes up to 200,000 maternal and infant deaths every year. Previous studies show that two receptor molecules, hyaluronic acid and chondroitin sulphate A, mediate the adhesion of parasite-infected erythrocytes in the placenta of patients, which is believed to be a key step in the pathogenesis of the disease. In this study, we aimed to identify sites of malaria-induced adaptation by scanning for signatures of natural selection in 24 genes in the complete biosynthesis pathway of these two receptor molecules. We analyzed a total of 24 Mb of publicly available polymorphism data from the International HapMap project for three human populations with European, Asian and African ancestry, with the African population from a region of presently and historically high malaria prevalence. Using methods based on allele frequency distributions, genetic differentiation between populations, and long-range haplotype structure, we found only limited evidence for malaria-induced genetic adaptation in this set of genes in the African population; however, we identified one candidate gene with clear evidence of selection in the Asian population. Although historical exposure to malaria in this population cannot be ruled out, we speculate that it might be caused by other pathogens, as there is growing evidence that these molecules are important receptors in a variety of host-pathogen interactions. We propose to use the present methods in a systematic way to help identify candidate regions under positive selection as a consequence of malaria.

    A Natural History of FUT2 Polymorphism in Humans

    11 pages, 2 figures, 3 tables. Because pathogens are powerful selective agents, host-cell surface molecules used by pathogens as identification signals can reveal the signature of selection. Most of them are oligosaccharides, synthesized by glycosyltransferases. One known example is balancing selection shaping ABO evolution as a consequence of both A and B antigens being recognized as receptors by some pathogens, and of anti-A and/or anti-B natural antibodies produced by hosts conferring protection against the numerous infectious agents expressing A and B motifs. These antigens can also be found in tissues other than blood if there is activity of another enzyme, FUT2, a fucosyltransferase responsible for ABO biosynthesis in body fluids. Homozygotes for null variants at this locus present the nonsecretor phenotype (se), because they cannot express ABO antigens in secretions. Multiple independent mutations have been shown to be responsible for the nonsecretor phenotype, which coexists with the secretor phenotype in most populations. In this study, we have resequenced the coding region of FUT2 in 732 individuals from 39 worldwide human populations. We report a complex pattern of natural selection acting on the gene. Although frequencies of secretor and nonsecretor phenotypes are similar in different populations, the point mutations at the base of the phenotypes are different, with some variants showing a long history of balancing selection among Eurasian and African populations, and one recent variant showing a fast spread in East Asia, likely due to positive selection. Thus, a convergent phenotype composition has been achieved through different mutations with different evolutionary histories. This research was funded by grants BFU2005-00243 and SAF-2007-63171 awarded by Ministerio de Educación y Ciencia (Spain) and by the Direcció General de Recerca of Generalitat de Catalunya (Grup de Recerca Consolidat 2005SGR/00608). 
Funds were also provided by the Etablissement Français du Sang (EFS) Centre Atlantique, and by the Ministère Français de la Recherche (EA3034). All the sequencing was done at the Genomic Service, Universitat Pompeu Fabra; we thank Stéphanie Plaza and Roger Anglada for their help. Computational analysis was helped by the National Institute for Bioinformatics (www.inab.org), and SNP genotyping services were provided by the Spanish "Centro Nacional de Genotipado" (CEGEN; www.cegen.org); both are platforms of Genoma España. A.F.-A. is supported by a PhD fellowship from UPF and M.S. by the Programa de becas FPU del Ministerio de Educación y Ciencia, Spain (AP2005-3982). Peer reviewed.