
    Fregene: Simulation of realistic sequence-level data in populations and ascertained samples

    Background: FREGENE simulates sequence-level data over large genomic regions in large populations. Because, unlike coalescent simulators, it works forwards through time, it allows complex scenarios of selection, demography, and recombination to be modelled simultaneously. Detailed tracking of sites under selection is implemented in FREGENE and provides the opportunity to test theoretical predictions and gain new insights into mechanisms of selection. We describe here the main functionalities of both FREGENE and SAMPLE, a companion program that can replicate association study datasets. Results: We report detailed analyses of six large simulated datasets that we have made publicly available. Three demographic scenarios are modelled: one panmictic, one substructured with migration, and one complex scenario that mimics the principal features of genetic variation in major worldwide human populations. For each scenario there is one neutral simulation, and one with a complex pattern of selection. Conclusion: FREGENE and the simulated datasets will be valuable for assessing the validity of models for selection, demography and population genetic parameters, as well as the efficacy of association studies. Its principal advantages are modelling flexibility and computational efficiency. It is open source and object-oriented. As such, it can be customised and the range of models extended.

    Inference of locus-specific ancestry in closely related populations

    A characterization of the genetic variation of recently admixed populations may reveal historical population events, and is useful for the detection of single nucleotide polymorphisms (SNPs) associated with diseases through association studies and admixture mapping. Inference of locus-specific ancestry is key to our understanding of the genetic variation of such populations. While a number of methods for the inference of locus-specific ancestry are accurate when the ancestral populations are quite distant (e.g. African–Americans), current methods incur a large error rate when inferring the locus-specific ancestry in admixed populations where the ancestral populations are closely related (e.g. Americans of European descent)

    Probability that a chromosome is lost without trace under the neutral Wright-Fisher model with recombination

    I describe an analytical approximation for calculating the short-term probability of loss of a chromosome under the neutral Wright-Fisher model with recombination. I also present an upper and lower bound for this probability. Exact analytical calculation of this quantity is difficult and computationally expensive because the number of different ways in which a chromosome can be lost grows very large in the presence of recombination. Simulations indicate that the probabilities obtained using my approximate formula are always comparable to the true expectations provided that the number of generations remains small. These results are useful in the context of an algorithm that we recently developed for simulating Wright-Fisher populations forward in time. C++ programs that can efficiently calculate these formulas are available on request. Comment: Additional Information, Padhukasahasram et al. 2008, Genetics, FORWSIM algorith
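
    For orientation, the case without recombination is easy to compute exactly, which helps show why recombination makes the problem hard. The sketch below is not the paper's approximation and not the FORWSIM algorithm. It assumes a neutral Wright-Fisher population of `n_chrom` chromosomes with no recombination, and tracks the number of copies descended from one focal chromosome as a Markov chain with binomial resampling. The function name `loss_probability` is hypothetical.

```python
from math import comb


def loss_probability(n_chrom: int, generations: int) -> float:
    """Probability that one focal chromosome has no descendant copies
    after `generations` generations (neutral Wright-Fisher, NO recombination)."""
    # dist[k] = probability that the focal lineage currently has k copies
    dist = [0.0] * (n_chrom + 1)
    dist[1] = 1.0
    for _ in range(generations):
        new = [0.0] * (n_chrom + 1)
        for k, p_k in enumerate(dist):
            if p_k == 0.0:
                continue
            q = k / n_chrom  # chance that a given offspring picks the lineage
            for j in range(n_chrom + 1):
                # Binomial(n_chrom, q) probability of j copies next generation
                new[j] += p_k * comb(n_chrom, j) * q**j * (1 - q) ** (n_chrom - j)
        dist = new
    return dist[0]


if __name__ == "__main__":
    for t in (1, 2, 5, 10):
        print(t, loss_probability(20, t))
```

    After one generation the result equals (1 - 1/n_chrom)^n_chrom, which is close to 1/e for large populations. With recombination, segments of the chromosome can survive on different descendants, so a single copy count no longer describes the state.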

    AWclust: point-and-click software for non-parametric population structure analysis

    Background: Population structure analysis is important to genetic association studies and evolutionary investigations. Parametric approaches, e.g. STRUCTURE and L-POP, usually assume Hardy-Weinberg equilibrium (HWE) and linkage equilibrium among loci in sample population individuals. However, the assumptions may not hold and allele frequency estimation may not be accurate in some data sets. The improved version of STRUCTURE (version 2.1) can incorporate linkage information among loci but is still sensitive to high background linkage disequilibrium. Nowadays, large-scale single nucleotide polymorphisms (SNPs) are becoming popular in genetic studies. Therefore, it is imperative to have software that makes full use of these genetic data to generate inference even when model assumptions do not hold or allele frequency estimation suffers from high variation. Results: We have developed point-and-click software for non-parametric population structure analysis distributed as an R package. The software takes advantage of the large number of SNPs available to categorize individuals into ethnically similar clusters and it does not require assumptions about population models. Nor does it estimate allele frequencies. Moreover, this software can also infer the optimal number of populations. Conclusion: Our software tool employs non-parametric approaches to assign individuals to clusters using SNPs. It provides efficient computation and an intuitive way for researchers to explore ethnic relationships among individuals. It can be complementary to parametric approaches in population structure analysis.

    Responsible participation and housing: restoring democratic theory to the scene

    Tensions between individual liberty and collective social justice characterise many advanced liberal societies. These tensions are reflected in the challenges posed for representative democracy both by participatory democratic practices and by the current emphasis on (so-called) responsible participation. Based on the example of ‘community’ housing associations in Scotland, this paper explores these tensions. It is argued that the critique of responsibility may have been over-stated – that, in particular, ‘community’ housing associations offer the basis for relatively more inclusive and effective processes of decision-making than council housing, which relies on the traditional processes and institutions of representative local government for its legitimacy

    A Bayesian method for evaluating and discovering disease loci associations

    Background: A genome-wide association study (GWAS) typically involves examining representative SNPs in individuals from some population. A GWAS data set can concern a million SNPs and may soon concern billions. Researchers investigate the association of each SNP individually with a disease, and it is becoming increasingly commonplace to also analyze multi-SNP associations. Techniques for handling so many hypotheses include the Bonferroni correction and recently developed Bayesian methods. These methods can encounter problems. Most importantly, they are not applicable to a complex multi-locus hypothesis which has several competing hypotheses rather than only a null hypothesis. A method that computes the posterior probability of complex hypotheses is a pressing need. Methodology/Findings: We introduce the Bayesian network posterior probability (BNPP) method which addresses the difficulties. The method represents the relationship between a disease and SNPs using a directed acyclic graph (DAG) model, and computes the likelihood of such models using a Bayesian network scoring criterion. The posterior probability of a hypothesis is computed based on the likelihoods of all competing hypotheses. The BNPP can not only be used to evaluate a hypothesis that has previously been discovered or suspected, but also to discover new disease loci associations. The results of experiments using simulated and real data sets are presented. Our results concerning simulated data sets indicate that the BNPP exhibits both better evaluation and discovery performance than does a p-value based method. For the real data sets, previous findings in the literature are confirmed and additional findings are found. Conclusions/Significance: We conclude that the BNPP resolves a pressing problem by providing a way to compute the posterior probability of complex multi-locus hypotheses. A researcher can use the BNPP to determine the expected utility of investigating a hypothesis further. Furthermore, we conclude that the BNPP is a promising method for discovering disease loci associations. © 2011 Jiang et al
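
    The step of turning likelihoods of competing hypotheses into posterior probabilities is ordinary Bayes normalisation, and can be sketched as follows. This is only an illustration of that step, not the BNPP method or its Bayesian network scoring criterion. The function name `posterior_probabilities`, the uniform default prior, and the example log-likelihood values are assumptions made for the sketch.

```python
import math


def posterior_probabilities(log_likelihoods, priors=None):
    """Posterior probability of each competing hypothesis, given the
    log-likelihood of the data under each one and optional prior weights."""
    n = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / n] * n  # assumed uniform prior over hypotheses
    # Work in log space so very small likelihoods do not underflow
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    m = max(log_joint)
    weights = [math.exp(x - m) for x in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical log-likelihoods for three competing models
    print(posterior_probabilities([-1050.2, -1047.9, -1055.0]))
```

    Subtracting the largest log value before exponentiating keeps the computation stable when likelihoods are tiny, which is typical for models scored on many individuals.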

    Dense mapping of MYH9 localizes the strongest kidney disease associations to the region of introns 13 to 15

    Admixture mapping recently identified MYH9 as a susceptibility gene for idiopathic focal segmental glomerulosclerosis (FSGS), HIV-associated nephropathy (HIVAN) and end-stage kidney disease attributed to hypertension (H-ESKD) in African Americans (AA). MYH9 encodes the heavy chain of non-muscle myosin IIA, a cellular motor involved in motility. A haplotype and its tagging SNPs spanning introns 12–23 were most strongly associated with kidney disease (OR 2–7; P < 10^-8, recessive). To narrow the region of association and identify potential causal variation, we performed a dense-mapping study using 79 MYH9 SNPs in AA populations with FSGS, HIVAN and H-ESKD (typed for a subset of 46 SNPs), for a total of 2496 cases and controls. The strongest associations were for correlated SNPs rs5750250, rs2413396 and rs5750248 in introns 13, 14 and 15, a region of 5.6 kb. Rs5750250 showed OR 5.0, 8.0 and 2.8; P = 2 × 10^-17, 2 × 10^-10 and 3 × 10^-22, respectively, for FSGS, HIVAN and H-ESKD; OR 5.7; P = 9 × 10^-27 for combined FSGS and HIVAN, recessive. An independent association was observed for rs11912763 in intron 33. Neither the highly associated SNPs nor the results of resequencing MYH9 in 40 HIVAN or FSGS cases and controls revealed non-synonymous changes that could account for the disease associations. Rs2413396 and one of the highly associated SNPs in intron 23, rs4821480, are predicted splicing motif modifiers. Rs5750250 combined with rs11912763 had receiver operator characteristic (ROC) C statistics of 0.80, 0.73 and 0.65 for HIVAN, FSGS and H-ESKD, respectively, allowing prediction of genetic risk by typing two SNPs.

    Design catalogue for eco-engineering of coastal artificial structures: a multifunctional approach for stakeholders and end-users

    Coastal urbanisation, energy extraction, food production, shipping and transportation have led to the global proliferation of artificial structures within the coastal and marine environments (sensu “ocean sprawl”), with subsequent loss of natural habitats and biodiversity. To mitigate and compensate for the impacts of ocean sprawl, the practice of eco-engineering of artificial structures has been developed over the past decade. Eco-engineering aims to create sustainable ecosystems that integrate human society with the natural environment for the benefit of both. The science of eco-engineering has grown markedly, yet synthesis of research into a user-friendly and practitioner-focused format is lacking. Feedback from stakeholders has repeatedly stated that a “photo user guide” or “manual” covering the range of eco-engineering options available for artificial structures would be beneficial. However, a detailed and structured “user guide” for eco-engineering in coastal and marine environments is not yet possible; therefore we present an accessible review and catalogue of trialled eco-engineering options and a summary of guidance for a range of different structures tailored for stakeholders and end-users as the first step towards a structured manual. This work can thus serve as a potential template for future eco-engineering guides. Here we provide suggestions for potential eco-engineering designs to enhance biodiversity and ecosystem functioning and services of coastal artificial structures with the following structures covered: (1) rock revetment, breakwaters and groynes composed of armour stones or concrete units; (2) vertical and sloping seawalls; (3) over-water structures (i.e., piers) and associated support structures; and (4) tidal river walls.

    Simultaneous Analysis of All SNPs in Genome-Wide and Re-Sequencing Association Studies

    Testing one SNP at a time does not fully realise the potential of genome-wide association studies to identify multiple causal variants, which is a plausible scenario for many complex diseases. We show that simultaneous analysis of the entire set of SNPs from a genome-wide study to identify the subset that best predicts disease outcome is now feasible, thanks to developments in stochastic search methods. We used a Bayesian-inspired penalised maximum likelihood approach in which every SNP can be considered for additive, dominant, and recessive contributions to disease risk. Posterior mode estimates were obtained for regression coefficients that were each assigned a prior with a sharp mode at zero. A non-zero coefficient estimate was interpreted as corresponding to a significant SNP. We investigated two prior distributions and show that the normal-exponential-gamma prior leads to improved SNP selection in comparison with single-SNP tests. We also derived an explicit approximation for type-I error that avoids the need to use permutation procedures. As well as genome-wide analyses, our method is well-suited to fine mapping with very dense SNP sets obtained from re-sequencing and/or imputation. It can accommodate quantitative as well as case-control phenotypes, covariate adjustment, and can be extended to search for interactions. Here, we demonstrate the power and empirical type-I error of our approach using simulated case-control data sets of up to 500 K SNPs, a real genome-wide data set of 300 K SNPs, and a sequence-based dataset, each of which can be analysed in a few hours on a desktop workstation