    Identifying the favored mutation in a positive selective sweep.

    Most approaches that capture signatures of selective sweeps in population genomics data do not identify the specific mutation favored by selection. We present iSAFE (for "integrated selection of allele favored by evolution"), a method that enables researchers to accurately pinpoint the favored mutation in a large region (∼5 Mbp) by using a statistic derived solely from population genetics signals. iSAFE does not require knowledge of demography, the phenotype under selection, or functional annotations of mutations.

    A Markov blanket-based method for detecting causal SNPs in GWAS

    Background: Detecting epistatic interactions associated with complex and common diseases can help to improve prevention, diagnosis and treatment of these diseases. With the development of genome-wide association studies (GWAS), designing powerful and robust computational methods for identifying epistatic interactions associated with common diseases has become a great challenge for the bioinformatics community, because the study of epistatic interactions must deal with large genotype datasets and a huge number of possible combinations of genetic factors. Most existing computational detection methods are based on the classification capacity of SNP sets, which may fail to identify SNP sets that are strongly associated with the diseases and may introduce many false positives. In addition, most methods are not suitable for genome-wide scale studies due to their computational complexity. Results: We propose a new Markov blanket-based method, DASSO-MB (Detection of ASSOciations using Markov Blanket), to detect epistatic interactions in case-control GWAS. The Markov blanket of a target variable T completely shields T from all other variables. Thus, we can guarantee that the SNP set detected by DASSO-MB has a strong association with diseases and contains the fewest false positives. Furthermore, DASSO-MB uses a heuristic search strategy that calculates the association between variables, avoiding the time-consuming training process required by other machine-learning methods. We apply our algorithm to simulated datasets and a real case-control dataset. We compare DASSO-MB to other commonly used methods and show that our method significantly outperforms them and is capable of finding SNPs strongly associated with diseases. Conclusions: Our study shows that DASSO-MB can identify a minimal set of causal SNPs associated with diseases, which contains fewer false positives than other existing methods. Given the huge size of the genomic datasets produced by GWAS, this is critical for saving the potential costs of biological experiments and for providing an efficient guideline for pathogenesis research.

    Identifying Signatures of Natural Selection in Tibetan and Andean Populations Using Dense Genome Scan Data

    High-altitude hypoxia (reduced inspired oxygen tension due to decreased barometric pressure) exerts severe physiological stress on the human body. Two high-altitude regions where humans have lived for millennia are the Andean Altiplano and the Tibetan Plateau. Populations living in these regions exhibit unique circulatory, respiratory, and hematological adaptations to life at high altitude. Although these responses have been well characterized physiologically, their underlying genetic basis remains unknown. We performed a genome scan to identify genes showing evidence of adaptation to hypoxia. We looked across each chromosome to identify genomic regions with previously unknown function with respect to altitude phenotypes. In addition, groups of genes functioning in oxygen metabolism and sensing were examined to test the hypothesis that particular pathways have been involved in genetic adaptation to altitude. Applying four population genetic statistics commonly used for detecting signatures of natural selection, we identified selection-nominated candidate genes and gene regions in these two populations (Andeans and Tibetans) separately. The Tibetan and Andean patterns of genetic adaptation are largely distinct from one another, with both populations showing evidence of positive natural selection in different genes or gene regions. Interestingly, one gene previously known to be important in cellular oxygen sensing, EGLN1 (also known as PHD2), shows evidence of positive selection in both Tibetans and Andeans. However, the pattern of variation for this gene differs between the two populations. Our results indicate that several key HIF-regulatory and targeted genes are responsible for adaptation to high altitude in Andeans and Tibetans, and several different chromosomal regions are implicated in the putative response to selection. These data suggest a genetic role in high-altitude adaptation and provide a basis for future genotype/phenotype association studies necessary to confirm the role of selection-nominated candidate genes and gene regions in adaptation to altitude.

    Comparison of measures of marker informativeness for ancestry and admixture mapping

    Background: Admixture mapping is a powerful gene mapping approach for an admixed population formed from ancestral populations with different allele frequencies. The power of this method relies on the ability of ancestry informative markers (AIMs) to infer ancestry along the chromosomes of admixed individuals. In this study, more than one million SNPs from HapMap databases and simulated data have been interrogated in admixed populations using various measures of ancestry informativeness: Fisher Information Content (FIC), Shannon Information Content (SIC), F statistics (F_ST), Informativeness for Assignment Measure (I_n), and the Absolute Allele Frequency Differences (delta, δ). The objectives are to compare these measures of informativeness to select SNP markers for ancestry inference, and to determine the accuracy of AIM panels selected by each measure in estimating the contributions of the ancestors to the admixed population. Results: F_ST and I_n had the highest Spearman correlation and the best agreement as measured by Kappa statistics based on deciles. Although the different measures of marker informativeness performed comparably well, analyses based on the top 1 to 10% ranked informative markers of simulated data showed that I_n was better in estimating ancestry for an admixed population. Conclusions: Although millions of SNPs have been identified, only a small subset needs to be genotyped in order to accurately predict ancestry with a minimal error rate in a cost-effective manner. In this article, we compared various methods for selecting ancestry informative SNPs using simulations as well as SNP genotype data from samples of admixed populations and showed that the I_n measure estimates ancestry proportion (in an admixed population) with lower bias and mean square error.
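To make three of the measures named in this abstract concrete, here is a minimal sketch for a single biallelic SNP in two ancestral populations. It is not the authors' code. It assumes equal weighting of the two populations, uses Wright's heterozygosity-based definition of F_ST, and uses the standard informativeness-for-assignment formula with natural logarithms. The function names are hypothetical.

```python
import math


def delta(p1: float, p2: float) -> float:
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)


def fst(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic locus, two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)            # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                                # monomorphic locus: no signal
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def _xlogx(x: float) -> float:
    """x * ln(x), with the usual convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def informativeness_in(p1: float, p2: float) -> float:
    """I_n for one biallelic locus and two populations (K = 2)."""
    total = 0.0
    # Loop over the two alleles: frequencies (p1, p2) and (1 - p1, 1 - p2).
    for a1, a2 in ((p1, p2), (1 - p1, 1 - p2)):
        mean_freq = (a1 + a2) / 2
        total += -_xlogx(mean_freq) + (_xlogx(a1) + _xlogx(a2)) / 2
    return total
```

A marker fixed for different alleles in the two populations reaches the maximum of each measure (δ = 1, F_ST = 1, I_n = ln 2), while a marker with identical frequencies scores 0 on all three.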

    Estimating Genetic Ancestry Proportions from Faces

    Ethnicity can be a means by which people identify themselves and others. This type of identification mediates many kinds of social interactions and may reflect adaptations to a long history of group living in humans. Recent admixture in the US between groups from different continents, and the historically strong emphasis on phenotypic differences between members of these groups, presents an opportunity to examine the degree of concordance between estimates of group membership based on genetic markers and on visually-based estimates of facial features. We first measured the degree of Native American, European, African and East Asian genetic admixture in a sample of 14 self-identified Hispanic individuals, chosen to cover a broad range of Native American and European genetic admixture proportions. We showed frontal and side-view photographs of the 14 individuals to 241 subjects living in New Mexico, and asked them to estimate the degree of Native American admixture for each individual. We assess the overall concordance for each observer based on an aggregated measure of the difference between the observer and the genetic estimates. We find that observers reach a significantly higher degree of concordance than expected by chance, and that the degree of concordance as well as the direction of the discrepancy in estimates differs based on the ethnicity of the observer, but not on the observers' age or sex. This study highlights the potentially high degree of discordance between physical appearance and genetic measures of ethnicity, as well as how perceptions of ethnic affiliation are context-specific. We compare our findings to those of previous studies and discuss their implications.

    Genome-wide patterns of differentiation and spatially varying selection between postglacial recolonization lineages of Populus alba (Salicaceae), a widespread forest tree

    Studying the divergence continuum in plants is relevant to fundamental and applied biology because of the potential to reveal functionally important genetic variation. In this context, whole-genome sequencing (WGS) provides the necessary rigour for uncovering footprints of selection. We resequenced populations of two divergent phylogeographic lineages of Populus alba (n = 48), thoroughly characterized by microsatellites (n = 317), and scanned their genomes for regions of unusually high allelic differentiation and reduced diversity using > 1.7 million single nucleotide polymorphisms (SNPs) from WGS. Results were confirmed by Sanger sequencing. On average, 9134 high-differentiation (≥ 4 standard deviations) outlier SNPs were uncovered between populations, 848 of which were shared by ≥ three replicate comparisons. Annotation revealed that 545 of these were located in 437 predicted genes. Twelve percent of differentiation outlier genome regions exhibited significantly reduced genetic diversity. Gene ontology (GO) searches were successful for 327 high-differentiation genes, and these were enriched for 63 GO terms. Our results provide a snapshot of the roles of ‘hard selective sweeps’ vs divergent selection of standing genetic variation in distinct postglacial recolonization lineages of P. alba. Thus, this study adds to our understanding of the mechanisms responsible for the origin of functionally relevant variation in temperate trees.
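The "≥ 4 standard deviations" outlier criterion mentioned in this abstract can be illustrated with a short sketch. The abstract does not say how the threshold was computed, so this example assumes the simplest reading: a SNP is an outlier when its differentiation value lies at least four population standard deviations above the mean of all SNPs in the comparison. The function name and the input format (a plain list of per-SNP values) are hypothetical.

```python
from statistics import mean, pstdev


def high_differentiation_outliers(values, threshold_sd=4.0):
    """Return indices of SNPs whose value is >= threshold_sd SDs above the mean.

    `values` holds one differentiation score per SNP (for example, an
    allele frequency difference or a per-site F_ST estimate).
    """
    mu = mean(values)
    sd = pstdev(values)           # population SD over all SNPs in the scan
    if sd == 0:
        return []                 # no variation, so nothing can be an outlier
    return [i for i, v in enumerate(values) if (v - mu) / sd >= threshold_sd]
```

For example, with 99 SNPs scoring 0.01 and one SNP scoring 0.9, only the last SNP is returned. Requiring an outlier to recur across replicate comparisons, as the study does, would amount to intersecting the index lists from each comparison.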

    How Humans Differ from Other Animals in Their Levels of Morphological Variation

    Animal species come in many shapes and sizes, as do the individuals and populations that make up each species. To us, humans might seem to show particularly high levels of morphological variation, but perhaps this perception is simply based on enhanced recognition of individual conspecifics relative to individual heterospecifics. We here more objectively ask how humans compare to other animals in terms of body size variation. We quantitatively compare levels of variation in body length (height) and mass within and among 99 human populations and 848 animal populations (210 species). We find that humans show low levels of within-population body height variation in comparison to body length variation in other animals. Humans do not, however, show distinctive levels of within-population body mass variation, nor of among-population body height or mass variation. These results are consistent with the idea that natural and sexual selection have reduced human height variation within populations, while maintaining it among populations. We therefore hypothesize that humans have evolved on a rugged adaptive landscape with strong selection for body height optima that differ among locations.

    Haplotype differences for copy number variants in the 22q11.23 region among human populations: a pigmentation-based model for selective pressure.

    Two gene clusters are tightly linked in a narrow region of chromosome 22q11.23: the macrophage migration inhibitory factor (MIF) gene family and the glutathione S-transferase theta class. Within 120 kb in this region, two 30-kb deletions reach high frequencies in human populations. This gives rise to four haplotypic arrangements, which modulate the number of genes in both families. The variable patterns of linkage disequilibrium (LD) between these copy number variants (CNVs) in diverse human populations remain poorly understood. We analyzed 2469 individuals belonging to 27 human populations with different ethnic origins. Then we correlated the genetic variability of 22q11.23 CNVs with environmental variables. We confirmed an increasing strength of LD from Africa to Asia and to Europe. Further, we highlighted strongly significant correlations between the frequency of one of the haplotypes and pigmentation-related variables: skin color (R² = 0.675, P < 0.001), distance from the equator (R² = 0.454, P < 0.001), UVA radiation (R² = 0.439, P < 0.001), and UVB radiation (R² = 0.313, P = 0.002). The fact that all MIF-related genes are retained on this haplotype and the evidence gleaned from experimental systems seem to agree with the role of MIF-related genes in melanogenesis. As such, we propose a model that explains the geographic and ethnic distribution of 22q11.23 CNVs among human populations, assuming that MIF-related gene dosage could be associated with adaptation to low UV radiation.
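As a reading aid for the correlation values reported in this abstract, the sketch below computes a coefficient of determination for paired per-population values. The abstract does not state which model was fitted, so the example assumes a simple linear relationship, in which case the coefficient of determination equals the squared Pearson correlation. The function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    For a simple linear regression of y on x this equals the coefficient
    of determination.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("both variables must vary")
    return (s_xy * s_xy) / (s_xx * s_yy)
```

Here `x` would hold one environmental value per population (such as distance from the equator) and `y` the frequency of the haplotype in that population. A perfectly linear relationship in either direction gives 1.0.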