
    High performance computation of landscape genomic models integrating local indices of spatial association

    Since its introduction, landscape genomics has developed quickly with the increasing availability of both molecular and topo-climatic data. The current challenges of the field mainly involve processing large numbers of models and disentangling selection from demography. Several methods address the latter, either by estimating a neutral model from population structure or by simultaneously inferring environmental and demographic effects. Here we present Samβada, an integrated approach to study signatures of local adaptation, providing rapid processing of whole genome data and enabling assessment of spatial association using molecular markers. Specifically, candidate loci for adaptation are identified by automatically assessing genome-environment associations. As a complement, measuring the Local Indicators of Spatial Association (LISA) for these candidate loci makes it possible to detect whether similar genotypes tend to cluster in space, which is a useful indication of possible kinship between individuals. In this paper, we also analyze SNP data from Ugandan cattle to detect signatures of local adaptation with Samβada, BayEnv, LFMM and an outlier method (FDIST approach in Arlequin) and compare their results. Samβada is open source software for Windows, Linux and MacOS X, available at http://lasig.epfl.ch/sambada. Comment: 1 figure in text, 1 figure in supplementary material. The structure of the article was modified and some explanations were updated. The methods and results presented are the same as in the previous version.
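
    As a rough illustration of the LISA idea mentioned in this abstract, the sketch below computes a local Moran's I for each individual. It is not Samβada's implementation: the function name `local_morans_i`, the row-standardised neighbour weights and the toy data are all assumptions made for the example.

```python
def local_morans_i(values, weights):
    """Local Moran's I for each observation (illustrative sketch only).

    values  : one numeric genotype score per individual
    weights : square matrix, weights[i][j] > 0 if j is a neighbour of i
              (assumed weighting scheme; rows are standardised below)
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    m2 = sum(d * d for d in z) / n          # second moment of the deviations
    if m2 == 0:
        raise ValueError("all values are identical; Moran's I is undefined")
    result = []
    for i in range(n):
        row_sum = sum(weights[i])
        # Spatial lag: weighted mean deviation of the neighbours of i.
        lag = sum(w * zj for w, zj in zip(weights[i], z)) / row_sum if row_sum else 0.0
        result.append(z[i] * lag / m2)
    return result


# Four individuals on a line; only adjacent individuals are neighbours.
genotypes = [0, 0, 1, 1]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
lisa = local_morans_i(genotypes, adjacency)
```

    In this toy case the two end individuals, whose only neighbour shares their genotype, score 1.0, while the two middle individuals, with one like and one unlike neighbour, score zero. Positive values indicate that similar genotypes sit next to each other.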

    Diversity, genetic mapping, and signatures of domestication in the carrot (Daucus carota L.) genome, as revealed by Diversity Arrays Technology (DArT) markers

    Carrot is one of the most economically important vegetables worldwide, but genetic and genomic resources supporting carrot breeding remain limited. We developed a Diversity Arrays Technology (DArT) platform for wild and cultivated carrot and used it to investigate genetic diversity and to develop a saturated genetic linkage map of carrot. We analyzed a set of 900 DArT markers in a collection of plant materials comprising 94 cultivated and 65 wild carrot accessions. The accessions were attributed to three separate groups: wild, Eastern cultivated and Western cultivated. Twenty-seven markers showing signatures for selection were identified. They showed a directional shift in frequency from the wild to the cultivated, likely reflecting diversifying selection imposed in the course of domestication. A genetic linkage map constructed using 188 F2 plants comprised 431 markers with an average distance of 1.1 cM, divided into nine linkage groups. Using previously anchored single nucleotide polymorphisms, the linkage groups were physically attributed to the nine carrot chromosomes. A cluster of markers mapping to chromosome 8 showed significant segregation distortion. Two of the 27 DArT markers with signatures for selection were segregating in the mapping population and were localized on chromosomes 2 and 6. Chromosome 2 was previously shown to carry the Vrn1 gene governing the biennial growth habit essential for cultivated carrot. The results reported here provide background for further research on the history of carrot domestication and identify genomic regions potentially important for modern carrot breeding

    The Population Genetic Signature of Polygenic Local Adaptation

    Adaptation in response to selection on polygenic phenotypes may occur via subtle allele frequency shifts at many loci. Current population genomic techniques are not well posed to identify such signals. In the past decade, detailed knowledge about the specific loci underlying polygenic traits has begun to emerge from genome-wide association studies (GWAS). Here we combine this knowledge from GWAS with robust population genetic modeling to identify traits that may have been influenced by local adaptation. We exploit the fact that GWAS provide an estimate of the additive effect size of many loci to estimate the mean additive genetic value for a given phenotype across many populations as simple weighted sums of allele frequencies. We first describe a general model of neutral genetic value drift for an arbitrary number of populations with an arbitrary relatedness structure. Based on this model we develop methods for detecting unusually strong correlations between genetic values and specific environmental variables, as well as a generalization of $Q_{ST}/F_{ST}$ comparisons to test for over-dispersion of genetic values among populations. Finally we lay out a framework to identify the individual populations or groups of populations that contribute to the signal of overdispersion. These tests have considerably greater power than their single locus equivalents because they look for positive covariance between like effect alleles, and they also significantly outperform methods that do not account for population structure. We apply our tests to the Human Genome Diversity Panel (HGDP) dataset using GWAS data for height, skin pigmentation, type 2 diabetes, body mass index, and two inflammatory bowel disease datasets. This analysis uncovers a number of putative signals of local adaptation, and we discuss the biological interpretation and caveats of these results. Comment: 42 pages including 8 figures and 3 tables; supplementary figures and tables not included on this upload, but are mostly unchanged from v
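
    The "simple weighted sums of allele frequencies" described in this abstract can be sketched as follows. This is a minimal example, not the authors' code: the function name `genetic_values`, the factor of 2 (assuming diploid individuals) and the toy numbers are assumptions.

```python
def genetic_values(effect_sizes, allele_freqs):
    """Estimate a mean additive genetic value for each population.

    effect_sizes : additive effect of the focal allele at each locus
                   (e.g. as estimated by a GWAS)
    allele_freqs : one list per population, giving the focal allele
                   frequency at each locus, in the same locus order
    """
    values = []
    for freqs in allele_freqs:
        if len(freqs) != len(effect_sizes):
            raise ValueError("each population needs one frequency per locus")
        # Weighted sum of frequencies; the factor 2 assumes diploids.
        values.append(2.0 * sum(a * p for a, p in zip(effect_sizes, freqs)))
    return values


# Toy data: three loci, two populations (hypothetical numbers).
effects = [0.5, -0.25, 1.0]
freqs = [
    [0.5, 0.5, 0.5],   # population 1
    [1.0, 0.0, 0.5],   # population 2
]
values = genetic_values(effects, freqs)
```

    Here population 2 ends up with the larger genetic value because it carries more of the trait-increasing allele at the first locus and none of the trait-decreasing allele at the second. The tests described in the abstract then ask whether such differences exceed what neutral drift would produce.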

    Sparse reduced-rank regression for imaging genetics studies: models and applications

    We present a novel statistical technique, the sparse reduced rank regression (sRRR) model, which is a strategy for multivariate modelling of high-dimensional imaging responses and genetic predictors. By adopting penalisation techniques, the model is able to enforce sparsity in the regression coefficients, identifying subsets of genetic markers that best explain the variability observed in subsets of the phenotypes. To properly exploit the rich structure present in each of the imaging and genetics domains, we additionally propose the use of several structured penalties within the sRRR model. Using simulation procedures that accurately reflect realistic imaging genetics data, we present detailed evaluations of the sRRR method in comparison with the more traditional univariate linear modelling approach. In all settings considered, we show that sRRR possesses better power to detect the deleterious genetic variants. Moreover, using a simple genetic model, we demonstrate the potential benefits, in terms of statistical power, of carrying out voxel-wise searches as opposed to extracting averages over regions of interest in the brain. Since this entails the use of phenotypic vectors of enormous dimensionality, we suggest the use of a sparse classification model as a de-noising step, prior to the imaging genetics study. We also present the application of a data re-sampling technique within the sRRR model for model selection. Using this approach we are able to rank the genetic markers in order of importance of association to the phenotypes, and similarly rank the phenotypes in order of importance to the genetic markers. Lastly, we illustrate the application of the proposed statistical models to three real imaging genetics datasets and highlight some potential associations.

    Geographical distribution of selected and putatively neutral SNPs in Southeast Asian malaria parasites.

    Loci targeted by directional selection are expected to show elevated geographical population structure relative to neutral loci, and a flurry of recent papers have used this rationale to search for genome regions involved in adaptation. Studies of functional mutations that are known to be under selection are particularly useful for assessing the utility of this approach. Antimalarial drug treatment regimes vary considerably between countries in Southeast Asia, selecting for local adaptation at parasite loci underlying resistance. We compared the population structure revealed by 10 nonsynonymous mutations (nonsynonymous single-nucleotide polymorphisms [nsSNPs]) in four loci that are known to be involved in antimalarial drug resistance, with patterns revealed by 10 synonymous mutations (synonymous single-nucleotide polymorphisms [sSNPs]) in housekeeping genes or genes of unknown function in 755 Plasmodium falciparum infections collected from 13 populations in six Southeast Asian countries. Allele frequencies at known nsSNPs underlying resistance varied markedly between locations ($F_{ST}$ = 0.18-0.66), with the highest frequencies on the Thailand-Burma border and the lowest frequencies in neighboring Lao PDR. In contrast, we found weak but significant geographic structure ($F_{ST}$ = 0-0.14) for 8 of 10 sSNPs. Importantly, all 10 nsSNPs showed significantly higher $F_{ST}$ ($P < 8 \times 10^{-5}$) than simulated neutral expectations based on observed $F_{ST}$ values in the putatively neutral sSNPs. This result was unaffected by the methods used to estimate allele frequencies or the number of populations used in the simulations. Given that dense single-nucleotide polymorphism (SNP) maps and rapid SNP assay methods are now available for P. falciparum, comparing genetic differentiation across the genome may provide a valuable aid to identifying parasite loci underlying local adaptation to drug treatment regimes or other selective forces. However, the high proportion of polymorphic sites that appear to be under balancing selection (or linked to selected sites) in the P. falciparum genome violates the central assumption that selected sites are rare, which complicates identification of outlier loci, and suggests that caution is needed when using this approach.
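
    To make the differentiation statistic in this abstract concrete, the sketch below computes a basic Wright-style $F_{ST}$ for one biallelic SNP from per-population allele frequencies. The estimator actually used in the study is not stated in the abstract, so the heterozygosity-based formula, the equal population weights and the name `fst_biallelic` are assumptions.

```python
def fst_biallelic(freqs):
    """Wright-style F_ST for one biallelic locus (illustrative sketch).

    freqs : frequency of one allele in each population; populations
            are weighted equally, which is an assumption of this sketch.
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity in the pooled (total) population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic overall; no differentiation
    # Mean expected heterozygosity within populations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    return (h_t - h_s) / h_t


fixed_difference = fst_biallelic([0.0, 1.0])    # alternative alleles fixed
no_difference = fst_biallelic([0.5, 0.5])       # identical frequencies
moderate = fst_biallelic([0.25, 0.75])
```

    Values near 1 mean the populations carry different alleles and values near 0 mean they share the same frequencies. An outlier scan of the kind described above compares each locus against the range seen at putatively neutral sites.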

    Tagging the signatures of domestication in common bean (Phaseolus vulgaris) by means of pooled DNA samples

    Background and Aims: The main aim of this study was to use an amplified fragment length polymorphism (AFLP)-based, large-scale screening of the whole genome of Phaseolus vulgaris to determine the effects of selection on the structure of the genetic diversity in wild and domesticated populations. Methods: Using pooled DNA samples, seven each of wild and domesticated populations of P. vulgaris were studied using 2506 AFLP markers (on average, one every 250 kb). About 10% of the markers were also analysed on individual genotypes and were used to infer allelic frequencies empirically from bulk data. In both data sets, tests were made to determine the departure from neutral expectation for each marker using an FST-based method. Key Results: The most important outcome is that a large fraction of the genome of the common bean (16%; P < 0.01) appears to have been subjected to effects of selection during domestication. Markers obtained in individual genotypes were also mapped and classified according to their proximities to known genes and quantitative trait loci (QTLs) of the domestication syndrome. Most of the markers that were found to be potentially under the effects of selection were located in the proximity of previously mapped genes and QTLs related to the domestication syndrome. Conclusions: Overall, the results indicate that in P. vulgaris a large portion of the genome appears to have been subjected to the effects of selection, probably because of linkage to the loci selected during domestication. As most of the markers that are under the effects of selection are linked to known loci related to the domestication syndrome, it is concluded that population genomics approaches are very efficient in detecting QTLs. A method based on bulk DNA samples is presented that is effective in pre-screening for a large number of markers to determine selection signatures.