14 research outputs found

    Positive selection in humans : from singles to interaction maps

    From Darwin’s On the Origin of Species to the recent wealth of genomic data, many biologists have focused their research on understanding how natural selection has shaped variability among and within species. Although theoretical and empirical advances have been remarkable, most of the biological mechanisms underlying the molecular basis of human adaptation remain to be elucidated. The selectionist view of adaptation accounts for the bias towards studying the evolution of genes independently. Most published studies aiming to detect positive selection from either polymorphism or divergence data have used a candidate-gene or a genome-wide scan approach, as described in the first two articles presented here. However, gene evolution is largely influenced by the biological context in which the encoded protein performs its intrinsic function(s). The phenotype, not the genotype, is at the interface with natural selection. Thus, in order to understand gene evolution, and adaptive selection in particular, it is crucial to reduce the gap between genotype and phenotype. Genes and proteins do not act in isolation, but rather interact with one another to perform a given biological function. Therefore, one promising framework for studying natural selection at the molecular level is to consider gene networks, as described in the last two articles of the present thesis. Analyses of gene networks describing the Insulin/TOR signal transduction cascade and the whole protein-protein physical interaction map yield striking results. Namely, genes acting at the core of both networks, thus having either more effect on a given phenotype or more pleiotropic effects within the organism, are more likely to be targeted by recent positive selection, as inferred using polymorphism data.

    Recent positive selection has acted on genes encoding proteins with more interactions within the whole human interactome

    Genes vary in their likelihood to undergo adaptive evolution. The genomic factors that determine adaptability, however, remain poorly understood. Genes function in the context of molecular networks, with some occupying more important positions than others and thus being likely to be under stronger selective pressures. However, how positive selection distributes across the different parts of molecular networks is still not fully understood. Here, we inferred positive selection using comparative genomics and population genetics approaches through the comparison of 10 mammalian and 270 human genomes, respectively. In agreement with previous results, we found that genes with lower network centralities are more likely to evolve under positive selection (as inferred from divergence data). Surprisingly, polymorphism data yield results in the opposite direction to divergence data: genes with higher centralities are more likely to have been targeted by positive selection during recent human evolution. Our results indicate that the relationship between centrality and the impact of adaptive evolution depends strongly on the mode of positive selection and/or the evolutionary time-scale. This work was funded by the “Ministerio de Ciencia y Tecnología” (Spain) (grant BFU2013-43726-P), and the “Direcció General de Recerca, Generalitat de Catalunya (Grup de Recerca Consolidat 2009 SGR 1101)” awarded to J.B. P.L. was supported by a Ph.D. fellowship from “Acción Estratégica de Salud, en el marco del Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica 2008–2011” from Instituto de Salud Carlos III. D.A.-P. was a “Juan de la Cierva” fellow from the “Ministerio de Economía y Competitividad” (Spain) (JCI-2011-11089). M.A.F. was supported by a Principal Investigator grant from Science Foundation Ireland (12/IP/1673) and a project from the “Ministerio de Economía y Competitividad” (grant number BFU2012-36346).
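The comparison described in this abstract (network centrality of genes with and without a signal of positive selection) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the edge list, the gene labels and the set of "selected" genes are hypothetical toy data, and degree is used as the simplest possible centrality measure, not necessarily the one used in the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy interactome: each pair is one physical interaction.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]

# Hypothetical set of genes flagged by some test of recent positive selection.
selected = {"A", "E"}


def degree_centrality(edge_list):
    """Count the number of interaction partners (degree) of every gene."""
    degree = defaultdict(int)
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def mean_degree_by_group(degree, flagged):
    """Return (mean degree of flagged genes, mean degree of all other genes)."""
    inside = [d for gene, d in degree.items() if gene in flagged]
    outside = [d for gene, d in degree.items() if gene not in flagged]
    return mean(inside), mean(outside)


degree = degree_centrality(edges)
sel_mean, other_mean = mean_degree_by_group(degree, selected)
print(f"mean degree, selected genes: {sel_mean}")
print(f"mean degree, other genes:    {other_mean}")
```

A real analysis would need a significance test and controls for confounders such as gene length or expression level; the sketch only shows the shape of the comparison.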

    Hierarchical boosting: a machine-learning framework to detect and classify hard selective sweeps in human populations

    MOTIVATION: Detecting positive selection in genomic regions is a recurrent topic in population genetic studies of natural populations. However, there is little consistency among the regions detected in different genome-wide scans using different tests and/or populations. Furthermore, few methods address the challenge of classifying selective events according to specific features such as age, intensity or state (completeness). RESULTS: We have developed a machine-learning classification framework that exploits the combined ability of some selection tests to uncover different polymorphism features expected under the hard sweep model, while controlling for population-specific demography. As a result, we achieve high sensitivity toward hard selective sweeps while adding insights about their completeness (whether a selected variant is fixed or not) and age of onset. Our method also determines the relevance of the individual methods implemented so far to detect positive selection under specific selective scenarios. We calibrated and applied the method to three reference human populations from the 1000 Genomes Project to generate a genome-wide classification map of hard selective sweeps. This study improves the detection of selective sweeps by going beyond the classical selection versus no-selection classification strategy, and offers an explanation for the lack of consistency observed among selection tests when applied to real data. Very few signals were observed in the African population studied, even though our method has higher sensitivity under this population's demography.
AVAILABILITY AND IMPLEMENTATION: The genome-wide results for three human populations from the 1000 Genomes Project and an R package implementing the 'Hierarchical Boosting' framework are available at http://hsb.upf.edu/. This work was supported by Ministerio de Economía y Competitividad (Spain) [grants BFU2010-19443, BFU2013-43726-P]; and the Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement de la Generalitat de Catalunya [GRC 2014 SGR 866] to J.B. M.P. and G.D. have been supported by a grant of the FPI program, Ministerio de Economia y Competitividad; P.L. by a grant from the Instituto de Salud Carlos III; J.E. was supported through a Postdoc scholarship from the Volkswagenstiftung [Az: I/85 198

    Positive selection in the chromosome 16 VKORC1 genomic region has contributed to the variability of anticoagulant response in humans

    VKORC1 (vitamin K epoxide reductase complex subunit 1, 16p11.2) is the main genetic determinant of human response to oral anticoagulants of antivitamin K type (AVK). This gene was recently suggested to be a putative target of positive selection in East Asian populations. In this study, we genotyped the HGDP-CEPH Panel for six VKORC1 SNPs and downloaded chromosome 16 genotypes from the HGDP-CEPH database in order to characterize the geographic distribution of footprints of positive selection within and around this locus. A unique VKORC1 haplotype carrying the promoter mutation associated with AVK sensitivity showed especially high frequencies in all 17 HGDP-CEPH East Asian population samples. VKORC1 and 24 neighboring genes were found to lie in a 505 kb region of strong linkage disequilibrium in these populations. Patterns of allele frequency differentiation and haplotype structure suggest that this genomic region has been subjected to a near-complete selective sweep in all East Asian populations and only in this geographic area. The most extreme scores of the different selection tests are found within a smaller 45 kb region that contains VKORC1 and three other genes (BCKDK, MYST1 (KAT8), and PRSS8) with different functions. Because of the strong linkage disequilibrium, it is not possible to determine whether VKORC1 or one of the three other genes is the target of this strong positive selection, which could explain present-day differences among human populations in AVK dose requirement. Our results show that the extended region surrounding a presumed single target of positive selection should be analyzed for genetic variation in a wide range of genetically diverse populations in order to account for other neighboring and confounding selective events and the hitchhiking effect. This work was supported by the Spanish National Institute for Bioinformatics (www.inab.org).
PL is supported by a PhD fellowship from “Acción Estratégica de Salud, en el Marco del Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica 2008–2011”

    Distribution of events of positive selection and population differentiation in a metabolic pathway: the case of asparagine N-glycosylation

    Asparagine N-Glycosylation is one of the most important forms of protein post-translational modification in eukaryotes. This metabolic pathway can be subdivided into two parts: an upstream sub-pathway required for achieving proper folding for most of the proteins synthesized in the secretory pathway, and a downstream sub-pathway required to give variability to trans-membrane proteins, and involved in adaptation to the environment and innate immunity. Here we analyze the nucleotide variability of the genes of this pathway in human populations, identifying which genes show greater population differentiation and which genes show signatures of recent positive selection. We also compare how these signals are distributed between the upstream and the downstream parts of the pathway, with the aim of exploring how forces of population differentiation and positive selection vary among genes involved in the same metabolic pathway but subject to different functional constraints. Our results show that genes in the downstream part of the pathway are more likely to show a signature of population differentiation, while events of positive selection are equally distributed between the two parts of the pathway. Moreover, events of positive selection are frequent in genes that are known to be at bifurcation points and that are identified as being in key positions by a network-level analysis, such as MGAT3 and GCS1. These findings indicate that the upstream part of the Asparagine N-Glycosylation pathway has lower diversity among populations, while the downstream part is freer to tolerate diversity among populations. Moreover, the distribution of signatures of population differentiation and positive selection can change between parts of a pathway, especially between parts that are exposed to different functional constraints.
Our results support the hypothesis that genes involved in constitutive processes can be expected to show lower population differentiation, while genes involved in traits related to the environment should show higher variability. Taken together, this work broadens our knowledge of how events of population differentiation and of positive selection are distributed among different parts of a metabolic pathway. This work was funded by grant BFU2010-19443 (subprogram BMC) awarded to JB by Ministerio de Ciencia y Tecnología (Spain), and the Direcció General de Recerca, Generalitat de Catalunya (Grup de Recerca Consolidat 2009 SGR 1101). GMD is supported by an FPI fellowship (BES-2009-017731) from the Ministerio de Ciencia y Tecnología (Spain). PL is supported by a PhD fellowship from “Acción Estratégica de Salud, 2008-2011” from Instituto de Salud Carlos III and LM is supported by a postdoctoral fellowship from the Juan de la Cierva Program of the Spanish Ministry of Science and Innovation (MICINN).

    1000 Genomes Selection Browser 1.0: A genome browser dedicated to signatures of natural selection in modern humans

    Searching for Darwinian selection in natural populations has been the focus of a multitude of studies over the last decades. Here we present the 1000 Genomes Selection Browser 1.0 (http://hsb.upf.edu) as a resource for signatures of recent natural selection in modern humans. We have implemented and applied a large number of neutrality tests as well as summary statistics informative for the action of selection such as Tajima’s D, CLR, Fay and Wu’s H, Fu and Li’s F* and D*, XPEHH, ΔiHH, iHS, FST, ΔDAF and XPCLR among others to low coverage sequencing data from the 1000 Genomes Project (Phase 1; release April 2012). We have implemented a publicly available genome-wide browser to communicate the results from three different populations of West African, Northern European and East Asian ancestry (YRI, CEU, CHB). Information is provided in UCSC-style format to facilitate integration with the rich UCSC browser tracks, and an access page is provided with instructions for convenient visualization. We believe that this expandable resource will facilitate the interpretation of signals of selection on different temporal, geographical and genomic scales. Funding: Ministerio de Ciencia y Tecnología (Spain); Direcció General de Recerca, Generalitat de Catalunya (Grup de Recerca Consolidat 2009 SGR 1101); Subprogram BMC [BFU2010-19443 awarded to J.B.]; Post-doctoral scholarship from the Volkswagenstiftung [Az: I/85 198 to J.E.]; Spanish government [BFU-2008-01046; SAF2011-29239]; Spanish government FPI scholarships [BES-2009-017731 and BES-2011-04502 to G.M.D. and M.P., respectively]; PhD fellowship from ‘Acción Estratégica de Salud, en el marco del Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica 2008-2011’ from Instituto de Salud Carlos III (to P.L.). Funding for open access charge: Prof. Jaume Bertranpetit.
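As an illustration of one of the neutrality statistics named in this abstract, the following is a minimal sketch of Tajima's D computed from a small set of aligned haplotypes. It follows the standard published formula but is not the browser's implementation; the function name `tajimas_d` and the toy haplotypes are assumptions made for this example, and sites are treated as independent with no missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings (e.g. '0'/'1' alleles)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 haplotypes")
    length = len(haplotypes[0])

    # S: number of segregating (polymorphic) sites in the sample.
    seg_sites = sum(
        1 for j in range(length) if len({h[j] for h in haplotypes}) > 1
    )
    if seg_sites == 0:
        return 0.0  # convention used in this sketch: no variation, report 0

    # pi: mean number of pairwise differences between haplotypes.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D contrasts pi with Watterson's estimator S / a1.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data: an excess of singletons versus intermediate-frequency variants.
rare = ["1000", "0100", "0010", "0001", "0000", "0000"]
balanced = ["1111", "1111", "1111", "0000", "0000", "0000"]
print(tajimas_d(rare), tajimas_d(balanced))
```

An excess of rare variants, as expected after a sweep or under population expansion, gives a negative value, while an excess of intermediate-frequency variants gives a positive one.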

    Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences

    Background: Population differentiation has proved to be effective for identifying loci under geographically localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing large frequency differences. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selective or other processes. Results: We demonstrate that while sites with low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, for which we identify the established functional SNP itself, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively. Conclusions: We identify known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research. This work was supported by The Wellcome Trust (098051), an Italian National Research Council (CNR) short-term mobility fellowship from the 2013 program to VC, and an EMBO Short Term Fellowship ASTF 324–2010 to V
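Population differentiation at a single variant is commonly summarized with FST. The sketch below uses the simple heterozygosity-based definition, FST = (HT - HS) / HT, for one biallelic site and two equally weighted populations. It is an illustrative assumption, not the estimator used in the study above, and the function name `wright_fst` and the example frequencies are hypothetical.

```python
def wright_fst(p1, p2):
    """FST at one biallelic site from allele frequencies in two populations.

    Assumes equal population weights and ignores sampling error, so it is
    only suitable as an illustration of the definition.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # site is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-pop
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies for three sites.
print(wright_fst(1.0, 0.0))  # fixed difference between populations
print(wright_fst(0.3, 0.3))  # identical frequencies
print(wright_fst(0.9, 0.1))  # strongly differentiated site
```

Sites with values near 1 are the "highly differentiated" candidates; in practice, sample size corrections (for example the Weir and Cockerham or Hudson estimators) matter when frequencies come from finite samples.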
